Meta-analysis seeks to combine the results of several experiments in order to improve the
accuracy of decisions. It is common to use a test for homogeneity to determine if the results
of the several experiments are sufficiently similar to warrant their combination into
an overall result. Cochran's Q statistic is frequently used for this homogeneity test. It is
often assumed that Q follows a chi-square distribution under the null hypothesis of
homogeneity, but it has long been known that this asymptotic distribution for Q is not accurate
for moderate sample sizes. Here we present formulas for the mean and variance of Q under
the null hypothesis which represent $O(1/n)$ corrections to the corresponding chi-square
moments in the one parameter case. The formulas are fairly complicated, and so we provide a
program (available at http://www.imperial.ac.uk/stathelp/researchprojects/metaanalysis)
for making the necessary calculations. We apply the results to the standardized mean
difference (Cohen's d-statistic) and consider two approximations: a gamma distribution with
estimated shape and scale parameters and the chi-square distribution with fractional
degrees of freedom equal to the estimated mean of Q. We recommend the latter distribution
as an approximate distribution for Q to use for testing the null hypothesis.

1 Introduction



In the meta-analysis of several studies, it is usual to conduct a "homogeneity test" to
determine if the effects measured by the studies are sufficiently similar to warrant their
combination into one grand summary effect using the fixed effect model, Normand (1999).
The most commonly used test statistic is Cochran's Q (Cochran, 1937). It is defined as
follows. Suppose that there are $I$ studies (or experiments), each of whose results is given
by an estimator $\hat\theta_i$ of a population effect $\theta_i$. Suppose that the variance of
$\hat\theta_i$ is given by $v_i$, which can be estimated in turn by $\hat v_i$. The summary
effects may be combined into a grand summary effect using a weighted average
$\hat\theta_w = \sum_i \hat w_i \hat\theta_i / \sum_i \hat w_i$, where the weights
$w_i$ and their estimators $\hat w_i$ are usually taken as inverses of the variances and their
estimators respectively (thus weighting more accurate studies more heavily). At this point
of the discussion, the summary effect may be quite general, such as the sample mean of
each study, the difference of means between treatment and control arms of each study or
the standardized difference of means between treatment and control arms of each study;
but in the main body of the paper we will restrict the discussion to cases in which the
estimators of $\theta_i$ and $w_i$ depend on only the one parameter $\theta_i$.



Cochran's Q statistic, which is used in the homogeneity test, is defined by
$Q = \sum_i \hat w_i (\hat\theta_i - \hat\theta_w)^2$. When testing the null hypothesis that
$\theta_1 = \cdots = \theta_I$, that is, the underlying effects measured by all the studies
are the same, it is common to assume that Q has a chi-square distribution with $I - 1$
degrees of freedom. This distribution appears to be asymptotically valid (as the sizes $n_i$
of the studies become large) over a wide choice of summary effects. There have been many
simulation studies of the accuracy of the chi-square approximation (see Hedges & Olkin (1985),
Viechtbauer (2007) and the references therein), but except for the case where the populations
are normally distributed with the parameters estimated by sample means and sample variances,
there are few theoretical results dealing with the question of the distribution of Q for
small or moderate sample sizes.
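
The two definitions above can be illustrated with a minimal Python sketch. The function name `cochran_q` and the toy inputs are illustrative assumptions; this is not the program referred to in the abstract.

```python
def cochran_q(effects, weights):
    """Return (weighted average, Q) for effect estimates and their weight estimates."""
    if len(effects) != len(weights) or not effects:
        raise ValueError("effects and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    # Grand summary effect: weighted average of the study effects.
    theta_w = sum(w * t for w, t in zip(weights, effects)) / total_weight
    # Q: weighted sum of squared deviations from the weighted average.
    q = sum(w * (t - theta_w) ** 2 for w, t in zip(weights, effects))
    return theta_w, q


if __name__ == "__main__":
    # Hypothetical toy data: three studies with equal weights.
    theta_w, q = cochran_q([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
    print(theta_w, q)
```

With equal weights, Q reduces to an ordinary sum of squared deviations from the unweighted mean.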



The chi-square distribution is an exact distribution of the Q statistic for normally
distributed populations with known variances resulting in known weights. Randomness
of the weights is traditionally ignored in meta-analysis; see Biggerstaff & Tweedie (1997),
Jackson (2006), Biggerstaff & Jackson (2008). In contrast, Cochran, as early as in his
1937 paper which dealt with the normally distributed case, recognized the need for a
correction to the chi-square distribution for moderate sample sizes and proposed such a
correction at that time. In 1951, James (1951) and Welch (1951) proposed separate
improved corrections to the distribution of Q (again for the normal case), corrections which
are equivalent to each other up to order $1/n_i$. Welch's proposal (more commonly used
and now known as the Welch test) referred Q to a rescaled F-distribution ($cF_{I-1,\nu}$) with
$I - 1$ and $\nu$ degrees of freedom, where $\nu$ and the rescaling constant $c$ are quantities
to be estimated from the data. In Welch's derivation, the properties of normality, including the
independence of the estimators of the weights (inverses of sample variances) and of the
effects (sample means), were heavily used; these properties are not generally valid in many
situations in which the Q statistic is commonly used. Improved approximations to the power
of the Welch test in the normal case are given in Kulinskaya et al. (2003).
Kulinskaya et al. (2004) extended the Welch test in the normal case to contrasts (such
as the difference of treatment and control means), and a Welch type Q test for robust
estimators of effects and their variances was introduced in Kulinskaya & Dollinger (2007).



In a series of papers, we plan to investigate corrections to the distribution of Q in
situations in which the estimators of the effects and of the weights are not statistically
independent. As far as we know, there have been no theoretical results before now on this
subject. We expect that the results will provide more accurate homogeneity tests when
the sample sizes are small or moderate. In this paper (the first of the planned series), we
investigate the situation in which the effect and weight estimators depend on a single
parameter. We will apply our general theory to an important special case: the standardized
mean difference (also known as Cohen's d, Cohen (1988)). Definitions appear in Section 3.

This paper is organized as follows. In Section 2 we present the general theory. In
Section 3 we apply the general theory to the standardized mean difference. Section 4
contains two real meta-analytic examples which have used the standardized mean difference
to measure the effects. Section 5 contains the results of a large number of simulations
which show the quality and the limitations of the new approximations for the homogeneity
test based on Q when the effects are measured as standardized mean differences. In the
final section we summarize the more important conclusions, make some recommendations
and indicate areas of future work. Some of the more complicated formulas have been
relegated to the Appendix.



2 The general theory 

Welch's 1951 correction to the distribution of Q was based on expansions to approximate 
the mean and variance of Q. He then used these moments to define an approximate 
distribution for Q. We follow this same general idea, but there are several important 
differences. Welch made the assumption that the underlying distributions were normally 
distributed and that the weights were inverses of the study variances, estimated by the 
sample variances. To permit as wide applicability as possible, we do not assume normality 
and allow the weights to be different from the inverses of the variances. Also we do not 
make the assumption that the estimators of the weights are statistically independent of 
the estimators for the effects. A third difference between our approach and that of Welch 
is that he based his approximations on an asymptotic expansion of the moment generating 
function of Q. We instead use the delta method, which is based on Taylor expansions of
Q and $Q^2$ about the mean of the effect size.



2.1 Notation and assumptions 

There are $I$ studies with corresponding effects $\theta_i$. The null hypothesis for the
homogeneity test will be equality of the effects, i.e., $\theta_1 = \cdots = \theta_I$; we will
denote the common effect under the null hypothesis by $\theta$. The effects are estimated by
random variables $\hat\theta_i$. The theoretical weights are $w_i$ and they are estimated by
$\hat w_i$. In most applications, we will have $w_i = 1/\mathrm{Var}[\hat\theta_i]$, but in this
section we merely assume that the weight estimators are some function $f_i$ of the effect
estimator $\hat\theta_i$, that is, $\hat w_i = f_i(\hat\theta_i)$, where the functions $f_i$
will generally depend on additional constants such as the sample size. The theoretical
weights under the null hypothesis will be $w_i = f_i(\theta)$. The assumption that the weights
are dependent only on the corresponding effects is an important limitation of the results
of this paper. In our next paper in this series, we plan to investigate the situation in
which the weights depend on more than one random variable.

We need to make some fairly standard assumptions about the orders (relative to the
sample sizes) of the central moments $E[(\hat\theta_i - \theta_i)^r]$ of the estimators
$\hat\theta_i$ and also the orders of the weights and their derivatives. Let $n_i$ represent
the sample size of the ith study. In the event that the studies have two arms (as in the
application in Section 3), let $n_i$ be the minimum sample size of the two arms. We will also
use the notation $n = \min\{n_i\}$ and sometimes express approximations in terms of orders of $n$.

To simplify notation, define $\Theta_i = \hat\theta_i - \theta_i$. We assume first that
$E[\Theta_i] = O(1/n_i^2)$. This condition will certainly be satisfied if the estimator
$\hat\theta_i$ is unbiased. In regular parametric problems, it is easy to remove the first-order
term from the asymptotic bias of maximum likelihood estimates (see Firth (1993)). We will need
higher moments up to and including the sixth central moment. For these moments, we assume the
following orders, which generally follow from $\sqrt{n_i}$ asymptotic normality:
$E[\Theta_i^2] = O(1/n_i)$, $E[\Theta_i^3] = O(1/n_i^2)$, $E[\Theta_i^4] = O(1/n_i^2)$,
$E[\Theta_i^5] = O(1/n_i^3)$ and $E[\Theta_i^6] = O(1/n_i^3)$. We further assume that the
weight estimators $\hat w_i$ and their first two derivatives with respect to $\hat\theta_i$
will be $O(n_i)$, as will be the case whenever the weights are inverses of the variances.

2.2 Expansions for $E[Q]$ and $E[Q^2]$

In this section we present expressions for $E[Q]$ and $E[Q^2]$ using Taylor expansions and
then taking expectations of these expansions. The Taylor expansions are centered about
the null hypothesis $\theta_1 = \cdots = \theta_I = \theta$, and thus all derivatives in this
section are to be evaluated at this null hypothesis. In our expansions we have kept all terms
to order $O(1/n)$. We begin with the first moment of Q.

We next substitute expressions for the indicated derivatives into this formula and
expand the double sum into combinations of single sums to obtain the following result.
To simplify the expression, we use the notation $W = \sum_i w_i$ and $U_i = 1 - w_i/W$. The
formula is expressed in terms of parameter values; estimates of these parameter values
will be needed when the formula is applied to data.

The expansion for the second moment $E[Q^2]$ up to order $O(1/n)$ requires terms of 4th,
5th and 6th degree. The expansion is given by

The derivatives of $Q^2$ needed for Equation 3 are fairly complicated and appear in the
Appendix.



2.3 Applying the formulas 

The formulas in Equations 2 and 3 are fairly general since they are not based on any
normality assumptions, and will be applicable to any situation in which there is only
one parameter and in which the weights and central moments meet the order conditions
described in Section 2.1.

To use the formulas for a specific application, the user will need to supply expressions
for the weights (that is, the functions $f_i$) and their first and second derivatives and also
expressions for the central moments $E[\Theta_i^r]$ for $r = 1, \ldots, 6$. We provide an
illustration in the next section where we apply the theory to the important special case of the
standardized mean difference. Because of the complexity of the formulas, we have provided a
computer program in R which can be used for the necessary calculations for applying the
Q test to the standardized mean difference. This program can be downloaded from the
website http://www.imperial.ac.uk/stathelp/researchprojects/metaanalysis

The weights and their derivatives which appear in the formulas need to be estimated
under the null hypothesis and will be different from the weights which are used for
calculating a specific value of Q from the data. Specifically, weights
$\hat w_i = f_i(\hat\theta_i)$ are first calculated. These weights are used to estimate the
combined effect $\hat\theta_w = \sum_i \hat w_i \hat\theta_i / \sum_i \hat w_i$
and to calculate the value of the Q statistic $\sum_i \hat w_i (\hat\theta_i - \hat\theta_w)^2$.
However, the weights which appear in Equations 2 and 3 need to be recalculated using the same
combined effect $\hat\theta_w$ as the effect for each of the studies. That is, these 'null'
weights are estimated by $f_i(\hat\theta_w)$ and the derivatives will be estimated by
$f_i'(\hat\theta_w)$ and $f_i''(\hat\theta_w)$.
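
The two-stage use of the weights described above can be sketched in Python as follows. The names `make_weight_fn` and `q_and_null_weights`, the form of the example weight function and the toy inputs are all illustrative assumptions and are not taken from the R program.

```python
def make_weight_fn(a, b):
    """Hypothetical weight function f(theta) = 1 / (a + b * theta**2)."""
    return lambda theta: 1.0 / (a + b * theta * theta)


def q_and_null_weights(effects, weight_fns):
    """Return (combined effect, Q, null weights) for per-study weight functions."""
    # Stage 1: weights evaluated at each study's own effect estimate; these define Q.
    data_weights = [f(t) for f, t in zip(weight_fns, effects)]
    total = sum(data_weights)
    theta_w = sum(w * t for w, t in zip(data_weights, effects)) / total
    q = sum(w * (t - theta_w) ** 2 for w, t in zip(data_weights, effects))
    # Stage 2: 'null' weights, every weight function evaluated at the combined effect.
    null_weights = [f(theta_w) for f in weight_fns]
    return theta_w, q, null_weights


if __name__ == "__main__":
    fns = [make_weight_fn(0.5, 0.5), make_weight_fn(0.5, 0.5)]
    print(q_and_null_weights([1.0, -1.0], fns))
```

The null weights, not the data weights, are the ones that would enter the moment formulas.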

Improved approximations to the mean and variance of Q under the null hypothesis are,
of course, not sufficient to conduct a test of the null hypothesis. A distribution for Q is
needed for this purpose. Ideally, simulations should be used for each separate application
type to select a family of distributions which fits the distribution of Q. However, we have
found in our simulations, which cover a number of situations (including both the one
parameter case discussed here and cases involving multiple parameters), that the
gamma family of distributions fits the null distribution of Q quite closely. Importantly,
this family includes the chi-square family as a special case. In particular, we have found
that the gamma family of distributions fits the distribution of Q very well when
the effects are measured by the standardized mean difference. Another contender is the
chi-square distribution with fractional degrees of freedom equal to the mean of Q (see
Section 3.4 below).



2.4 Inverse variance weights and the chi-square distribution

It is usual to choose weights to be inverse variances, i.e., $w_i = 1/E[\Theta_i^2]$. We make
this assumption in the remainder of this section. The expressions for the moments given
in Equations 2 and 3 simplify somewhat under this inverse variance assumption. In
Equation 2 only the first (or quadratic) term is $O(1)$. The remaining terms are all $O(1/n)$.
With inverse variance weights, this first term simplifies to $\sum_i (1 - w_i/W) = I - 1$.
Notice that this quantity is the first moment of the chi-square distribution with $I - 1$
degrees of freedom. Thus Equation 2 provides an order $O(1/n)$ correction to the chi-square
first moment.

In Equation 3 for the second moment of Q, the lowest degree terms are the first two
terms (those of fourth degree), and these are the only two terms of order $O(1)$. The
remaining terms are all of order $O(1/n)$. Using Equations 20 and 21 (in the Appendix)
for the fourth derivatives of $Q^2$, these two terms become



(4) 



The kurtosis $\gamma_2$ of a random variable with fourth central moment $\mu_4$ and variance
$\sigma^2$ is commonly defined by $\gamma_2 = \mu_4/\sigma^4 - 3$; this definition is arranged
so that normally distributed random variables have kurtosis of zero. Using this definition, we
will denote the kurtosis of $\hat\theta_i$ (the estimator of the ith effect) by $\gamma_{2i}$.
Then Equation 4 can be algebraically rearranged to

$$I^2 - 1 + \sum_i \gamma_{2i} U_i^2. \tag{5}$$

Since kurtosis is typically of order $O(1/n)$, we see that the second moment of the null
distribution of Q agrees with the second moment of the chi-square distribution with $I - 1$
degrees of freedom (which is $I^2 - 1$) up to order $O(1/n)$.

Thus when inverse variance weights are used, both the first and second moments of
the null distribution of Q agree with those of the chi-square distribution up to order $O(1)$
and Equations 2 and 3 provide order $O(1/n)$ corrections.



When discussing the distribution of Q, some authors make the simplifying assumption
that the weights are constants rather than random variables. See, for example, Biggerstaff
& Tweedie (1997), Jackson (2006), Biggerstaff & Jackson (2008). When this assumption
of constant weights holds, the derivatives of the weights become zero and all terms of our
approximate formula for $E[Q]$ vanish except for the first (or chi-square) term. Similarly, all
terms for $E[Q^2]$ vanish except for the first two terms. Accordingly, under the assumption
that the weights are known constants, the commonly used chi-square approximation for
Q has a mean which is accurate to order $O(1/n)$. But the second moment is accurate to
this order only when the estimators of the effects have kurtosis of order less than $1/n$.
However, since in reality the weights are random, both the mean and variance of Q need
the corrections given by our formulas in order to be accurate to order $O(1/n)$. Thus use
of our formulas should yield improved accuracy in the Q test when $n$ is not too large.



3 The Q test for the standardized mean difference 

In this section, we apply the theoretical results of the previous section to the standardized
mean difference (also known as Cohen's d-statistic). We begin with notation and a brief
review of the necessary background. See, for example, Hedges & Olkin (1985) for details.

3.1 Notation and weight functions

We assume that each of $I$ studies consists of two arms of sizes $n_{Ti}$ and $n_{Ci}$ having
normally distributed data with means $\mu_{Ti}$ and $\mu_{Ci}$ and that the variance
$\sigma_i^2$ is the same in each arm. (The subscripts T and C may be thought of as treatment
and control.) Then the effect measured by the standardized mean difference in the ith study is
given by

$$\delta_i = (\mu_{Ti} - \mu_{Ci})/\sigma_i. \tag{6}$$

A natural, but biased, estimator of $\delta_i$ is

$$d_i = (\bar X_{Ti} - \bar X_{Ci})/s_{pi}, \tag{7}$$

where $s_{pi}^2$ is the usual pooled variance estimator. Instead of using $d_i$, we follow the
usual practice to correct for the bias by using the unbiased estimator of $\delta_i$ defined by

$$g_i = J_i d_i = J_i (\bar X_{Ti} - \bar X_{Ci})/s_{pi}, \tag{8}$$

where

$$J_i = \frac{\Gamma[(N_i - 2)/2]}{\sqrt{(N_i - 2)/2}\;\Gamma[(N_i - 3)/2]} \tag{9}$$

is a constant depending only on the total sample size $N_i = n_{Ti} + n_{Ci}$. Define
$q_i = n_{Ci}/N_i$ to be the proportion of the total sample size in the control arm of the ith
study. It is known that (see Hedges & Olkin (1985), p. 104-5)

$$\mathrm{Var}[g_i] = \frac{(N_i - 2) J_i^2}{(N_i - 4) N_i q_i (1 - q_i)}
 + \left( \frac{(N_i - 2) J_i^2}{N_i - 4} - 1 \right) \delta_i^2 = A_i + B_i \delta_i^2, \tag{10}$$

where the constants $A_i$ and $B_i$ depend only on the sample sizes. Replacing $\delta_i$ by its
unbiased estimator in this variance formula, we obtain an estimator of the variance of
$g_i$ which is given by

$$\widehat{\mathrm{Var}}[g_i] = A_i + B_i g_i^2. \tag{11}$$

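Then the functions $f_i$ giving the estimated inverse variance weights in the Q statistic are
given by

$$\hat w_i = f_i(g_i) = \frac{1}{A_i + B_i g_i^2}. \tag{12}$$

The first and second derivatives of $\hat w_i$ are given by

$$\frac{d f_i}{d g_i} = -2 B_i g_i \hat w_i^2, \tag{13}$$

$$\frac{d^2 f_i}{d g_i^2} = -2 B_i \hat w_i^2 + 8 B_i^2 g_i^2 \hat w_i^3. \tag{14}$$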
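
As a rough illustration of Equations 9, 10 and 12 to 14, the following Python sketch evaluates the constants $J_i$, $A_i$, $B_i$ and the weight function with its first two derivatives for one study. The function names and the example arm sizes are hypothetical, and the sketch is independent of the R program mentioned earlier.

```python
import math


def smd_constants(n_t, n_c):
    """Return (J, A, B) for a two-arm study with arm sizes n_t and n_c."""
    n_total = n_t + n_c
    if n_total <= 4:
        raise ValueError("the variance of g exists only when N > 4")
    q = n_c / n_total  # proportion of the sample in the control arm
    half_df = (n_total - 2) / 2.0
    # J = Gamma((N-2)/2) / (sqrt((N-2)/2) * Gamma((N-3)/2)), computed via log-gamma.
    j = math.exp(math.lgamma(half_df) - math.lgamma(half_df - 0.5)) / math.sqrt(half_df)
    ratio = (n_total - 2) * j * j / (n_total - 4)
    a = ratio / (n_total * q * (1.0 - q))
    b = ratio - 1.0
    return j, a, b


def weight_and_derivatives(g, a, b):
    """Return (w, dw/dg, d2w/dg2) for the weight w = 1 / (a + b * g**2)."""
    w = 1.0 / (a + b * g * g)
    first = -2.0 * b * g * w ** 2
    second = -2.0 * b * w ** 2 + 8.0 * b ** 2 * g ** 2 * w ** 3
    return w, first, second


if __name__ == "__main__":
    j, a, b = smd_constants(10, 10)  # hypothetical study with 10 subjects per arm
    print(j, a, b)
    print(weight_and_derivatives(0.5, a, b))
```

Evaluating `weight_and_derivatives` at the combined effect rather than at a study's own estimate gives the 'null' versions described in Section 2.3.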
One issue that has arisen in meta-analysis involving the standardized mean difference
is how best to estimate the combined effect $\delta$. Estimators of $\delta$ appear in two
places in the Q test: in the definition of Q; and in the application of Equations 2 and 3 where
an estimated value of $\delta$ under the null hypothesis is used. It is known (see Yuan &
Bushman (2002)) that the natural weighted sum estimator
$g_w = \sum_i \hat w_i g_i / \sum_i \hat w_i$ is slightly biased. An
alternative choice is to use the estimator $g_\lambda = \sum_i \lambda_i g_i / \sum_i \lambda_i$;
since the weights $\lambda_i$ are not random, the estimator $g_\lambda$ is unbiased. We explored
both choices in our simulations of the Q test and found that the difference between these two
choices is barely noticeable and not of practical importance. We use the estimator $g_w$ in the
examples of Section 4.

3.2 The moments of g

In this section we suppress the subscript $i$ on all variables pertaining to the ith study.
The two main ingredients needed for applying the formulas for $E[Q]$ and $E[Q^2]$ are first
the weight functions and their derivatives (given in the previous section) and second the
central moments $E[(g - \delta)^r]$ for $r = 1, \ldots, 6$. We provide these central moments in
this section. For these moments to exist, we assume that $N > 8$. (We note that for the
usual chi-square approximation to hold, $N > 4$ is required just for the variance of g to
exist.) It is known that (Hedges & Olkin, 1985, p. 79) $\sqrt{Nq(1-q)}\,d$ has a non-central
t-distribution with $N - 2$ degrees of freedom and non-centrality parameter equal
to $\sqrt{Nq(1-q)}\,\delta$. To simplify notation, write $\gamma = \sqrt{Nq(1-q)}\,\delta$ for
the non-centrality parameter.

Denote a random variable with a non-central t-distribution with $N - 2$ degrees of
freedom and non-centrality parameter $\gamma$ by $t_{N-2}(\gamma)$. Then from (Johnson et al.
1995, p. 512), the moments of $t_{N-2}(\gamma)$ about zero are given by

$$E[t_{N-2}^r(\gamma)] = \left(\frac{N-2}{2}\right)^{r/2}
 \frac{\Gamma[(N-2-r)/2]}{\Gamma[(N-2)/2]}
 \sum_{j=0}^{\lfloor r/2 \rfloor} \binom{r}{2j} \frac{(2j)!}{2^j j!}\, \gamma^{r-2j}. \tag{15}$$

The first moment of $t_{N-2}(\gamma)$ will be denoted by $\mu_t$ and is given by

$$\mu_t = \left(\frac{N-2}{2}\right)^{1/2} \frac{\Gamma[(N-3)/2]}{\Gamma[(N-2)/2]}\, \gamma. \tag{16}$$

Then the central moments of $t_{N-2}(\gamma)$ are given by

$$E[(t_{N-2}(\gamma) - \mu_t)^r] = \sum_{k=0}^{r} (-1)^k \binom{r}{k} \mu_t^k\,
 E[t_{N-2}^{r-k}(\gamma)]. \tag{17}$$

Since $\sqrt{Nq(1-q)}\,g/J$ has the distribution $t_{N-2}(\gamma)$, we then have the desired
central moments needed for the formula for Q. These are

$$E[(g - \delta)^r] = \left(\frac{J}{\sqrt{Nq(1-q)}}\right)^r
 E[(t_{N-2}(\gamma) - \mu_t)^r]. \tag{18}$$
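
The chain of Equations 15 to 18 can be traced numerically with the Python sketch below, which computes the central moments of g for a single study from the raw moments of the non-central t-distribution. The function names and the example inputs are hypothetical; the formulas coded are the standard non-central t moment formulas as reconstructed above.

```python
import math


def nct_raw_moment(r, df, gamma):
    """r-th moment about zero of a non-central t variable (requires df > r)."""
    if df <= r:
        raise ValueError("the moment of order r exists only when df > r")
    log_scale = ((r / 2.0) * math.log(df / 2.0)
                 + math.lgamma((df - r) / 2.0) - math.lgamma(df / 2.0))
    # Polynomial in gamma: the r-th raw moment of a N(gamma, 1) variable.
    poly = sum(
        math.comb(r, 2 * j) * math.factorial(2 * j) / (2 ** j * math.factorial(j))
        * gamma ** (r - 2 * j)
        for j in range(r // 2 + 1)
    )
    return math.exp(log_scale) * poly


def g_central_moments(n_t, n_c, delta, max_order=6):
    """Return {r: E[(g - delta)^r]} for r = 1..max_order in one two-arm study."""
    n_total = n_t + n_c
    df = n_total - 2
    q = n_c / n_total
    root = math.sqrt(n_total * q * (1.0 - q))
    gamma = root * delta  # non-centrality parameter
    j_const = math.exp(math.lgamma(df / 2.0) - math.lgamma((df - 1) / 2.0)) / math.sqrt(df / 2.0)
    raw = [nct_raw_moment(k, df, gamma) for k in range(max_order + 1)]
    mu_t = raw[1]
    moments = {}
    for r in range(1, max_order + 1):
        # Binomial expansion of the central moment of t about its mean.
        central_t = sum((-1) ** k * math.comb(r, k) * mu_t ** k * raw[r - k]
                        for k in range(r + 1))
        # Rescale from t to g = J * t / sqrt(N q (1 - q)).
        moments[r] = (j_const / root) ** r * central_t
    return moments


if __name__ == "__main__":
    print(g_central_moments(10, 10, 0.5))  # hypothetical study, 10 per arm
```

A useful sanity check is that the first central moment is zero and the second agrees with $A + B\delta^2$ from Equation 10.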



3.3 Verifying the order conditions

One further step in applying the formulas for $E[Q]$ and $E[Q^2]$ is to check the order
conditions which are set out in Section 2.1. Recall that we use the notation $N_i$ to represent
the sum of the sizes of the two arms of the ith study and that we use the notation $n_i$ to
be the minimum of the two sizes, with $n = \min\{n_i\}$. It is evident from the definition in
Equation 10 that $A_i = O(1/n)$. Also $B_i$ (as defined in Equation 10) is $O(1/n)$; see Hedges
& Olkin (1985) for this fact. Thus $\hat w_i = (A_i + B_i g_i^2)^{-1}$ and its derivatives are
$O(n)$. Further, since $g_i$ is unbiased, the order condition for the first central moment of
$g_i$ is trivially satisfied.

In the remainder of this paragraph, we again suppress the subscript $i$ on all variables
pertaining to the ith study in order to simplify notation. Let $X$ denote a normally
distributed random variable with mean $\gamma$ and variance 1, i.e., $X \sim N(\gamma, 1)$.
Then the kth moments of the noncentral $t_{N-2}(\gamma)$ distribution are related to the
moments of $X$ by

$$\mu_k(t_{N-2}(\gamma)) = \frac{\Gamma[(N-2-k)/2]\,(N-2)^{k/2}}{2^{k/2}\,\Gamma[(N-2)/2]}\,
 \mu_k(X), \tag{19}$$

where $\mu_k$ denotes the kth moment (see Bain (1969)). From Stirling's formula,
$\mu_k(t_{N-2}(\gamma)) = \mu_k(X)(1 + O(n^{-1}))$. Therefore, from Equation 18, the central
moments of g are in the limit (up to an $O(1)$ multiplier $J^r$) the central moments of the
$N(\delta, (Nq(1-q))^{-1})$ distribution, so the order conditions are satisfied.



3.4 The gamma distribution

From our many simulations, it has become apparent that the gamma family with probability
density functions

$$f(t) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, t^{\alpha - 1} e^{-t/\beta}$$

is a very good fit to the distribution of Q under the null hypothesis of equal standardized
mean differences. For a random variable $T$ with a gamma distribution, the shape parameter
$\alpha$ is given by $\alpha = (E[T])^2/\mathrm{Var}[T]$ and the scale parameter $\beta$ is
given by $\beta = \mathrm{Var}[T]/E[T]$. The chi-square distribution with $\nu$ degrees of
freedom is a member of the gamma family with $\alpha = \nu/2$ and $\beta = 2$.
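
The moment matching just described amounts to a few lines of arithmetic. The Python sketch below (the function name is an illustrative assumption) converts estimates of $E[Q]$ and $E[Q^2]$ into gamma shape and scale parameters; the demonstration values 3.70 and 19.37 are the moment estimates reported for the light therapy example in Section 4.2.

```python
def gamma_from_moments(mean_q, second_moment_q):
    """Return (shape, scale) of the gamma distribution matching E[Q] and E[Q^2]."""
    variance = second_moment_q - mean_q ** 2
    if mean_q <= 0 or variance <= 0:
        raise ValueError("need a positive mean and a positive implied variance")
    shape = mean_q ** 2 / variance   # alpha = (E[T])^2 / Var[T]
    scale = variance / mean_q        # beta  = Var[T] / E[T]
    return shape, scale


if __name__ == "__main__":
    shape, scale = gamma_from_moments(3.70, 19.37)
    print(round(shape, 2), round(scale, 2))
```

Feeding in the chi-square moments $\nu$ and $\nu^2 + 2\nu$ returns shape $\nu/2$ and scale 2, which is the special case noted above.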

To verify the fit of the gamma family to the null distribution of Q, we simulated
a number of empirical distributions of Q and used the statistics package Statgraphics
Centurion XV (from Statpoint, Inc.) to compare the fit of these empirical distributions
with a variety of distribution families. The gamma family was always the best, typically
with a Kolmogorov-Smirnov (K-S) distance of only 0.002, which indicates a remarkably
good fit. The second best fitting family was the chi-square family with fractional degrees
of freedom, which typically had a K-S distance of four times that of the best fitting gamma
distribution.



4 Examples 

In this section, we present two examples to illustrate the application of the theory of
Sections 2 and 3 to real data. Our program, available at
http://www.imperial.ac.uk/stathelp/researchprojects/metaanalysis, can be used
to perform the calculations for these examples.



4.1 Meta-analysis of the use of a placebo for pain relief 



As a first example, consider the meta-analysis by Hróbjartsson & Gøtzsche (2004) of 17
randomized clinical trials comparing the use of a placebo for pain against no treatment
at all. Summary data from the meta-analysis are found in Table 1.

Because different studies used different measurement scales for evaluating pain, the
standardized mean difference is used in the meta-analysis in order to place each of the
effects on a scale free basis. The effect from each study appears in the table in the column
headed g. The weights (from Equation 12) which appear in the last column of the table
are given as percentages for ease of comparison, but the actual weights are needed for
computation of the Q statistic. The actual weights can be computed using the weight
total, which is $W = 212.91$. The weighted average of the effects is $g_w = -0.338$. The value
of Cochran's Q statistic is 22.07. Using the standard chi-square approximation with 16
degrees of freedom provides the p-value of 0.141 for the test for homogeneity.

To use the results from Sections 2 and 3, first the weights need to be recalculated
to reflect the null hypothesis of equal standardized mean differences. We take this null
value (as found above) to be $g_w = -0.338$ for each of the 17 studies and recalculate the
weights using Equation 12. Then the estimated first and second moments of the null
distribution of Q can be calculated from Equations 2 and 3 and the Appendix, yielding
the values $E[Q] = 15.19$ and $E[Q^2] = 257.57$ respectively. Thus the estimated parameters
of the approximating gamma distribution are $\alpha = 8.96$ (shape parameter) and
$\beta = 1.70$ (scale parameter). The p-value corresponding to $Q = 22.07$ is 0.098. The
p-value for a chi-square distribution with $E[Q] = 15.19$ degrees of freedom is 0.112.
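
For readers who wish to reproduce p-values of this kind without a statistics package, the Python sketch below evaluates upper-tail probabilities of the gamma family, and hence of the chi-square family with fractional degrees of freedom. It uses a generic series and continued-fraction evaluation of the regularized incomplete gamma function; this numerical routine and the function names are assumptions of the sketch and are not taken from the R program.

```python
import math


def gamma_sf(x, shape, scale=1.0):
    """Upper-tail probability P(T > x) for a gamma(shape, scale) variable."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if x <= 0:
        return 1.0
    z = x / scale
    log_front = -z + shape * math.log(z) - math.lgamma(shape)
    if z < shape + 1.0:
        # Power series for the lower regularized incomplete gamma function.
        term = 1.0 / shape
        total = term
        denom = shape
        for _ in range(10000):
            denom += 1.0
            term *= z / denom
            total += term
            if term < total * 1e-16:
                break
        return 1.0 - total * math.exp(log_front)
    # Continued fraction (modified Lentz method) for the upper regularized function.
    tiny = 1e-300
    b = z + 1.0 - shape
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - shape)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        factor = d * c
        h *= factor
        if abs(factor - 1.0) < 1e-15:
            break
    return h * math.exp(log_front)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable; df may be fractional."""
    return gamma_sf(x, df / 2.0, 2.0)


if __name__ == "__main__":
    q_obs = 22.07
    print("chi-square, 16 df:   ", chi2_sf(q_obs, 16))
    print("chi-square, 15.19 df:", chi2_sf(q_obs, 15.19))
    print("gamma(8.96, 1.70):   ", gamma_sf(q_obs, 8.96, 1.70))
```

The three printed values correspond to the three approximations discussed in this example and can be compared with the p-values quoted in the text.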

Study              n_T   Xbar_T     s_T   n_C   Xbar_C     s_C        g   w (%)
Reading 1982        18     1.60    1.30    20     2.30    2.00   -0.402     4.3
Conn 1986           13    28.20   18.40    14    44.40   15.70   -0.921     2.8
Hashish 1986        25    16.00   11.70    50    30.00   18.90   -0.821     7.2
Hashish 1988        25    42.00   25.00    25    60.00   23.00   -0.738     5.4
Hargreaves 1989     25     4.50    2.50    25     4.90    2.40   -0.161     5.8
Blanchard 1990b     18    11.90   23.90    24    20.70   34.80   -0.282     4.7
Blanchard 1990a     13     8.30   13.60    11    22.50   25.10   -0.697     2.5
Sprott 1993         10     7.90    3.00    10     7.40    3.00    0.160     2.3
Forster 1994        15     3.20    2.80    15     4.60    2.20   -0.541     3.3
Parker 1995         49     4.00    1.90    45     3.80    2.20    0.097    10.9
Rowbotham 1996      35    -4.40    8.70    35     1.90    8.70   -0.716     7.6
Wang 1997           25    10.70    7.30    26    13.40    5.80   -0.404     5.8
Robinson 2001       13     3.85    3.48    10     4.25    3.74   -0.107     2.6
Cupal 2001          10     2.70    0.95    10     2.70    1.34    0.000     2.3
Rawling 2001        89     5.30    4.72    96     5.60    4.90   -0.062    21.6
Kotani 2001         23    15.00    4.50    24    18.00    6.00   -0.554     5.2
Lin 2002            25    30.20   14.40    25    38.10   16.00   -0.511     5.6

Table 1: Data on placebo interventions for pain, Hróbjartsson & Gøtzsche (2004). The
data are on clinician-rated pain scales. The subscripts T and C refer to the treatment
and control arms of the studies. The column headed g contains the estimated standardized
mean differences between the two arms of each study and the column headed w contains the
weights (as percentages) used in computing the Q statistic.

To assess the relative accuracy of the three approximations (gamma and chi-square
with 16 and with 15.19 degrees of freedom) to the null distribution of Q, we conducted
a simulation of 100,000 random samples with seventeen studies having the same sizes as
those of Hróbjartsson & Gøtzsche (2004), but with all studies having the null value of
the standardized mean difference $\delta = -0.338$. The comparisons are as follows, where the
notation 'true' null refers to the simulation of 100,000 samples:

                           p-value for Q = 22.07    E[Q]    E[Q^2]   alpha   beta
simulation ('true' null)          0.108            15.22    260.76
chi-square est-df                 0.112            15.19
gamma                             0.098            15.19    257.57    8.96   1.70
chi-square 16 df                  0.141            16       288


The p-value produced by the gamma distribution and especially that from the chi-square
distribution with fractional degrees of freedom are substantially closer to the 'true'
p-value as given by the simulations than the p-value from the standard chi-square
distribution with 16 degrees of freedom. Notice that the first and second moments of the
'true' null distribution of Q are smaller than the corresponding moments of the chi-square
distribution, indicating the need for corrections. Our formulas produce an excellent
approximation of the first moment. The approximation for the second moment is much
better than that given by the chi-square distribution, but it is not nearly as good as the
approximation of the first moment.



4.2 Meta-analysis of light therapy for depression 

For a second example, consider the data from a meta-analysis of five studies to determine
the effect of light therapy for non-seasonal depression (bright light vs standard treatment),
Tuunainen et al. (2004). See Table 2 for the summary data.

Study              n_T   Xbar_T     s_T   n_C   Xbar_C     s_C       g   w (%)
Holsboer 1994       14    14.50    5.59    14     8.64    8.38    0.80    23.2
Fritzsche 2001b     10    15.80    5.30    10    16.90    6.40   -0.18    17.8
Fritzsche 2001a     11    10.01    8.60     9     9.50    3.80    0.07    17.7
Prasko 2002         11    17.00   11.20     9    13.00    7.90    0.39    17.3
Benedetti 2003      18    11.72    9.25    12    18.75    7.78   -0.79    24.0

Table 2: Data from a meta-analysis of light therapy for non-seasonal depression (bright
light vs standard treatment), Tuunainen et al. (2004). The data are on clinician-rated
mood scales. The subscripts T and C refer to the treatment and control arms of the
studies. The column headed g contains the standardized mean differences between the two
arms of each study and the column headed w contains the weights (as percentages) used in
computing the Q statistic.



The outcomes of the treatments were measured on a clinician-rated mood scales. The 
standardized mean difference statistic was used in the meta-analysis because different 
mood-scale scores were used in different studies. The weighted average of the effects is 
0.0437. The total of the weights is 27.1. The value of Cochran's Q statistic is 8.86, and 
the standard chi-square approximation with 4 degrees of freedom provides the p-value of 
0.065 for the test for homogeneity. 

To use the results from Sections 2 and 3, the weights first need to be recalculated
to reflect the null hypothesis of equal standardized mean differences. We take this null
value (as found above) to be $g_w = 0.0437$ for each of the 5 studies and recalculate the
weights using Equation 12. Then the formulas yield the following results. The estimated
first and second moments of the null distribution of Q are $E[Q] = 3.70$ and $E[Q^2] = 19.37$,
respectively. Thus the estimated parameters of the approximating gamma distribution are
$\alpha = 2.41$ (shape parameter) and $\beta = 1.54$ (scale parameter). The p-values corresponding
to Q = 8.86 are 0.037 (gamma approximation) and 0.053 (chi-square with 3.70 degrees
of freedom).
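
The step from the two estimated moments to the gamma parameters and the two p-values
can be sketched as follows. This is a minimal illustration rather than the authors'
program: the helper `gamma_sf` is a hypothetical name, and it evaluates the regularized
incomplete gamma function with the standard power series so that only the Python
standard library is needed.

```python
import math


def gamma_sf(x, shape, scale=1.0):
    """P(X > x) for a gamma variable, via the power series for the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (shape + n)
        total += term
    lower = total * math.exp(shape * math.log(z) - z - math.lgamma(shape))
    return 1.0 - lower


mean_q, second_moment_q, q_obs = 3.70, 19.37, 8.86

# Moment matching: mean = shape * scale, variance = shape * scale**2.
var_q = second_moment_q - mean_q ** 2
shape = mean_q ** 2 / var_q   # alpha
scale = var_q / mean_q        # beta

p_gamma = gamma_sf(q_obs, shape, scale)
# A chi-square with k degrees of freedom is a gamma with shape k/2 and scale 2,
# which also covers fractional k.
p_fractional_chi2 = gamma_sf(q_obs, mean_q / 2.0, 2.0)
p_standard_chi2 = gamma_sf(q_obs, 4 / 2.0, 2.0)
print(round(shape, 2), round(scale, 2))
print(round(p_gamma, 3), round(p_fractional_chi2, 3), round(p_standard_chi2, 3))
```

The fractional chi-square needs only the first moment, which is why it is unaffected by
any inaccuracy in the estimate of $E[Q^2]$.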






To assess the relative accuracy of the gamma and chi-square approximations to the
null distribution of Q, we conducted a simulation of 100,000 random samples with five
studies of the same sizes as those of Tuunainen et al. (2004), but with all studies having
the null value of the standardized mean difference $\delta = 0.0437$. The comparisons are as
follows, where the notation 'true' null refers to the simulation of 100,000 samples:





                           p-value for Q = 8.86   E[Q]   E[Q^2]   $\alpha$   $\beta$
simulation ('true' null)   0.050                  3.74   20.95
chi-square est-df          0.053                  3.70
gamma                      0.037                  3.70   19.37    2.41       1.54
chi-square 4 df            0.065                  4      24







Notice again that the first and second moments of the 'true' null distribution of Q are
smaller than the corresponding moments of the chi-square distribution. Our formulas
produce better approximations of these moments, but even with these better approximations
the p-value of the approximating gamma distribution is only slightly more accurate than
that produced by the chi-square distribution. The p-value from the chi-square distribution
with 3.70 d.f. (0.053) is very close to that of the simulations (0.050). The sample
sizes which appear in this meta-analysis (about 10 patients in each of the two arms of the
studies) are simply too small for the asymptotics implicit in our formulas for the second
moment of Q to be valid. It is somewhat surprising, but gratifying, that the method
based on the chi-square distribution with fractional d.f. is so accurate in this example.
For meta-analyses with samples of such small sizes, perhaps the best method of finding
a p-value associated with the obtained value of Q is the bootstrap-type procedure which
we used above: conduct a large simulation with the sample sizes of the actual data and
the weighted average of the effects used as a null value.
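
The bootstrap-type procedure can be sketched as below. Several ingredients are
assumptions made for illustration and are not taken from the paper: the small-sample
correction J = 1 - 3/(4(N - 2) - 1), the inverse-variance weights with variance
N/(n_T n_C) + g^2/(2N) (standing in for Equation 12, which is not reproduced here),
the seed, and the reduced number of replications. The simulated values therefore need
not match the table above exactly.

```python
import math
import random


def simulate_g(n_t, n_c, delta, rng):
    """Draw one standardized mean difference from normal data with unit variances."""
    mean_t = rng.gauss(delta, 1.0 / math.sqrt(n_t))
    mean_c = rng.gauss(0.0, 1.0 / math.sqrt(n_c))
    # (n - 1) s^2 is chi-square(n - 1), i.e. gamma with shape (n - 1)/2 and scale 2.
    ss_t = rng.gammavariate((n_t - 1) / 2.0, 2.0)
    ss_c = rng.gammavariate((n_c - 1) / 2.0, 2.0)
    df = n_t + n_c - 2
    pooled_sd = math.sqrt((ss_t + ss_c) / df)
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # assumed small-sample correction J
    return correction * (mean_t - mean_c) / pooled_sd


def cochran_q(effects, sizes):
    """Cochran's Q with assumed inverse-variance weights (not the paper's Equation 12)."""
    weights = []
    for g, (n_t, n_c) in zip(effects, sizes):
        n = n_t + n_c
        variance = n / (n_t * n_c) + g * g / (2.0 * n)
        weights.append(1.0 / variance)
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    return sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))


sizes = [(14, 14), (10, 10), (11, 9), (11, 9), (18, 12)]  # (n_T, n_C) from Table 2
null_delta = 0.0437   # weighted average effect, used as the null value
q_observed = 8.86
replications = 20000  # the text uses 100,000; fewer here to keep the sketch quick

rng = random.Random(2004)  # arbitrary seed, for reproducibility only
simulated_q = [
    cochran_q([simulate_g(n_t, n_c, null_delta, rng) for n_t, n_c in sizes], sizes)
    for _ in range(replications)
]
p_simulated = sum(q >= q_observed for q in simulated_q) / replications
mean_simulated_q = sum(simulated_q) / replications
print(round(mean_simulated_q, 2), round(p_simulated, 3))
```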



4.3 Generalizations from the examples 

There are some features of the examples which are common not only to the two examples
but also to all the simulations we have conducted. We wish to comment on some of
these here. Notice that the mean of the null distribution for Q found via the simulations
is somewhat less than the chi-square mean of $I - 1$, and the second moment of Q is
substantially less than the chi-square second moment of $I^2 - 1$. These facts appear to
be general. The formulas of Sections 2 and 3 which we use for estimating the mean and
second moment of Q underestimate both moments but provide estimates which are
substantially closer than the chi-square values to the simulated values. The formula which
estimates the mean seems to be very accurate, but the formula for estimating the second
moment is not as good. The over-estimation by the chi-square approximation results, as
is well known (see for example Viechtbauer (2007)), in a conservative hypothesis test; that
is, the null hypothesis is not rejected often enough. The underestimation by our formulas
results in a slightly liberal hypothesis test when the gamma approximation is used, but in
general the p-values are closer to the true values than those of the chi-square
approximation. The chi-square with estimated E(Q) degrees of freedom provides a nearly
perfect fit.
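
The chi-square moments quoted in this paragraph follow from the standard facts that a
chi-square variable with k degrees of freedom has mean k and variance 2k:

```latex
\begin{align*}
E\left[\chi^2_{I-1}\right] &= I - 1, \qquad
\operatorname{Var}\left[\chi^2_{I-1}\right] = 2(I - 1), \\
E\left[\left(\chi^2_{I-1}\right)^2\right] &= 2(I - 1) + (I - 1)^2 = (I - 1)(I + 1) = I^2 - 1 .
\end{align*}
```

For I = 5 this gives 4 and 24, the values listed for the chi-square with 4 df in the
comparison of Section 4.2.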

The fit of the gamma family of distributions to the empirical distribution of Q is 
remarkably close. The inaccuracy in the p-values given by our gamma approximation 
appears to be due to the underestimation of the second moment of Q. In fact, if we 
were able to accurately estimate the second moment of Q, then the estimated p-values 
would agree with the simulated p-values in our examples to three decimals. We do not 
understand the reason why the expansion for $E[Q^2]$ is not more accurate, or why it
always seems to underestimate the second moment. Resolution of this question is an area 
of possible future research. 



5 Simulations 



The simulations were performed using the R programming language (R Development Core
Team 2004). The details of the simulations are presented in four tables (Tables 6, 7, 8
and 9), all of which compare the Q test using the usual chi-square approximation to the
Q test using the gamma approximation and the chi-square approximation with fractional
degrees of freedom presented in this article. Table 6 contains results of the Q test under
the null hypothesis in the situation where all studies have the same size, the treatment
and control arms are equal, and the combined effect $\delta$ is estimated by $g_w$. Table 7
contains results similar to those of Table 6, but here the combined effect is estimated by
the alternative estimator introduced at the end of Section 3.1, which also explains how it
differs from $g_w$. Table 8 also contains results of the Q test under the null hypothesis,
but in the situation in which the study sizes are not equal. Finally, Table 9 contains
simulation results about the power of the Q test.



5.1 Simulations under the null hypothesis: equal study sizes 

Since $\sqrt{Nq(1-q)}\, g/J \sim t_{N-2}\bigl(\sqrt{Nq(1-q)}\,\delta\bigr)$, the values of g could
be simulated directly from the appropriately scaled non-central t-distribution. In this case
the quality of the simulations would depend on the implementation of the non-central t.
Instead we calculated g from first principles, using $\sigma_C^2 = \sigma_T^2 = 1$, and simulating
sample means $\bar X_C \sim N(0, n_C^{-1})$, $\bar X_T \sim N(\delta, n_T^{-1})$ and sample variances
$(n_C - 1)s_C^2 \sim \chi^2_{n_C-1}$ and $(n_T - 1)s_T^2 \sim \chi^2_{n_T-1}$.
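
A minimal sketch of this first-principles construction, written in Python rather than
the R used for the paper's simulations, is given below. It builds the uncorrected
statistic d = g/J from simulated means and variances and compares the average of
$\sqrt{Nq(1-q)}\,d$ with the known mean of the non-central t-distribution. The function
names, the seed and the number of draws are illustrative assumptions.

```python
import math
import random


def draw_d(n_total, q, delta, rng):
    """One uncorrected SMD d = g/J built from simulated sample means and variances."""
    n_c = round(q * n_total)
    n_t = n_total - n_c
    mean_c = rng.gauss(0.0, 1.0 / math.sqrt(n_c))
    mean_t = rng.gauss(delta, 1.0 / math.sqrt(n_t))
    # (n - 1) s^2 ~ chi-square(n - 1) = gamma(shape (n - 1)/2, scale 2) for unit variances.
    ss_c = rng.gammavariate((n_c - 1) / 2.0, 2.0)
    ss_t = rng.gammavariate((n_t - 1) / 2.0, 2.0)
    pooled_sd = math.sqrt((ss_c + ss_t) / (n_total - 2))
    return (mean_t - mean_c) / pooled_sd


def noncentral_t_mean(df, ncp):
    """Mean of a non-central t variable (valid for df > 1)."""
    log_ratio = math.lgamma((df - 1) / 2.0) - math.lgamma(df / 2.0)
    return ncp * math.sqrt(df / 2.0) * math.exp(log_ratio)


n_total, q, delta, draws = 20, 0.5, 0.5, 40000
rng = random.Random(1)  # arbitrary seed
scale = math.sqrt(n_total * q * (1.0 - q))

simulated_mean = sum(scale * draw_d(n_total, q, delta, rng) for _ in range(draws)) / draws
theoretical_mean = noncentral_t_mean(n_total - 2, scale * delta)
print(round(simulated_mean, 3), round(theoretical_mean, 3))
```

Agreement of the two means is only a sanity check on the construction; it does not
by itself verify the full distribution.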

The first series of simulations was performed for the situation in which all I of the
studies have equal sample sizes. The data patterns used in the first series of simulations
are described in Table 3. Each data pattern was replicated 100,000 times. The results
of these simulations for the case of equal treatment and control arms (q = 1/2) appear in
Tables 6 and 7.



I (number of studies)                                   5, 10, 20, 50
N (total size of both arms of each study)               20, 30, 40, 100, 200
q (proportion of each study size in the control arm)    1/2, 3/4
$\delta$ (null value of the SMD)                        0, 0.2, 0.5, 1, 2

Table 3: Data pattern of the simulations used in Tables 6 and 7 for the Type I
error in the Q test



The choice of $\delta$ values was determined by the standard convention (Cohen, 1988) that
the $\delta$ values of 0.2 and 0.5 constitute small and medium effect sizes, respectively. Instead
of using the traditional 'large' effect size of 0.8, we moved beyond to values of 1 and 2 to
explore the possible consequence on the Q test of very large values of $\delta$. Previous
simulations by Viechtbauer (2007) did not uncover any such consequence for $\delta$ values up
to 0.8.



Four p-values were obtained for each value of Q calculated from one of the 100,000
replications: the standard chi-square based p-value; the p-value based on the gamma
approximation using the known value of $\delta$ together with the formulas given in Equations 2
and 3; the p-value based on the gamma approximation using the estimated null value of
$\delta$ together with the formulas given in Equations 2 and 3; and the p-value based on the
chi-square approximation using the estimated degrees of freedom equal to E(Q). These
p-values were then compared to the levels $\alpha = 0.05$ and $\alpha = 0.1$ to obtain the type I
errors of each approximation at the 5% and 10% nominal levels. In the tables below these
values are denoted by $\chi^2_\alpha$, $\Gamma^{\delta}_\alpha$, $\Gamma^{\hat\delta}_\alpha$, and $\chi^2_{E(Q),\alpha}$, respectively.

In addition to the three p-values ($\chi^2_\alpha$, $\Gamma^{\hat\delta}_\alpha$ and $\chi^2_{E(Q),\alpha}$), Table 6 contains the
first two moments of Q calculated from our formulas with known $\delta$ (denoted $E_f[Q]$ and
$E_f[Q^2]$ in the table, where the subscript f denotes a result calculated from our
approximation formulas) and their sample counterparts $\bar Q$ and $\overline{Q^2}$; Table 7 additionally
provides the fourth p-value $\Gamma^{\delta}_\alpha$, the variance $\mathrm{Var}_f[Q]$ and the sample variance
$s^2(Q)$. These data permit us to judge the accuracy of the formulas which give
approximations for the moments of Q by comparing the formula values with the simulated
distribution of Q.



Results of the simulations with equal study sizes 

The first set of simulations can be used to answer two types of questions: how accurate 
are the moments estimated by our formulas — especially compared to the accuracy of the 
standard chi-square approximation?; and how accurate are the p-values (at the nominal 
levels 0.05 and 0.10) given by the gamma approximation and the chi-square approximation 
with fractional degrees of freedom — especially in comparison with the p-values produced 
by the standard chi-square approximation? We begin with the moments.

Accuracy of the approximating moments

The simulations provide us with sample estimates of the moments of Q, denoted $\bar Q$ and
$\overline{Q^2}$, which we take to be 'true' values. Thus we can estimate the relative error in the
first moment of the two approximations by $(E_f[Q]/\bar Q - 1) \times 100\%$ and
$((I - 1)/\bar Q - 1) \times 100\%$; and similarly estimate the relative errors in the second
moments by $(E_f[Q^2]/\overline{Q^2} - 1) \times 100\%$ and $((I^2 - 1)/\overline{Q^2} - 1) \times 100\%$.
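
As a worked instance of these relative-error formulas, the short sketch below applies
them to the values reported for the light therapy example of Section 4.2 (I = 5,
simulated moments 3.74 and 20.95, formula moments 3.70 and 19.37). The function name
is illustrative.

```python
def relative_error_pct(approximation, simulated):
    """Relative error (in percent) of an approximate moment against a simulated one."""
    return (approximation / simulated - 1.0) * 100.0


n_studies = 5
simulated_mean, simulated_second = 3.74, 20.95   # 'true' moments from the simulation
formula_mean, formula_second = 3.70, 19.37       # moments from the expansions

errors = {
    "formula, first moment": relative_error_pct(formula_mean, simulated_mean),
    "chi-square, first moment": relative_error_pct(n_studies - 1, simulated_mean),
    "formula, second moment": relative_error_pct(formula_second, simulated_second),
    "chi-square, second moment": relative_error_pct(n_studies ** 2 - 1, simulated_second),
}
for label, value in errors.items():
    print(f"{label}: {value:+.1f}%")
```

The signs reproduce the pattern described in the text: the formulas underestimate both
moments, the chi-square values overestimate them, and the formula errors are smaller
in magnitude.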

The three graphs of Figure 1 provide a summary of the comparison of the two
approximations to the first moment.

[Figure 1: three panels with horizontal axes N (total size of each study), I (number of
studies), and delta (standardized mean difference).]



Figure 1: Relative error of two approximations to the mean of Q as a function of the
total sample size of each study N (left), of the number of studies I (center), and of the
standardized mean difference $\delta$ (right). The lower curves are based on Equation 2 and the
upper curves are from the chi-square first moment. On the first and the second plots the
null value of the SMD $\delta$ is fixed at 0.5. On the rightmost plot, the number of studies is
fixed at I = 20.



We see that $E_f[Q]$ is generally quite accurate although it slightly underestimates $\bar Q$. In
fact the relative error in $E_f[Q]$ is almost always less than 3%, is less than 1% for samples
of size N = 30, and is essentially zero beginning with sample sizes of N = 40. In
contrast, the chi-square moment is always too large, with relative errors more than 10%
when N = 20 and around 5% when N = 30 or 40. Except for the case when the number
of studies is small (I = 5), the relative error of the chi-square first moment remains as
high as 1-2% even when the study sizes are as large as N = 200. We also see from the
graphs that the relative errors do not seem to depend on the number of studies I or on the
standardized mean difference $\delta$, with the exception that for the chi-square approximation
the relative error in the first moment increases slightly for the very large (and somewhat
unrealistic) values of $\delta = 1$ and 2.

The three graphs of Figure 2 provide a summary of the comparison of the percent error
in the approximation of the second moment $E[Q^2]$ by the two approximating distributions.

[Figure 2: three panels with horizontal axes N (total size of each study), I (number of
studies), and delta (standardized mean difference).]



Figure 2: Relative error of two approximations to the second moment of Q as a function
of the total sample size of each study N (left), of the number of studies I (center), and
of the standardized mean difference $\delta$ (right). The lower curves are based on Equation 3
and the upper curves are from the chi-square second moment. On the first and the second
plots the null value of the SMD $\delta$ is fixed at 0.5. On the rightmost plot, the number of
studies is fixed at I = 20.



We see that the chi-square approximation overestimates the second moment while our
formula underestimates the second moment, but by a smaller amount. The percent error
for both approximations decreases as total sample size N increases. The chi-square error
starts at about 20% for N = 20 and decreases to 9% for N = 40, and at N = 100 the error
is still in the 2-3% range. In contrast, the error using our formula starts at about 9% for
N = 20, decreases to less than 2% for N = 40, and at N = 100 the error is less than 1%.

We see from the graphs that the relative error in the second moment does not appear
to have much dependence on the number of studies I, except that there is a small
difference in error for the very small number of studies I = 5. The relative error for the
formula values $E_f[Q^2]$ seems to be independent of $\delta$, but surprisingly there is some
increase in the relative error of the chi-square approximation as $\delta$ increases, especially for
the very large values of $\delta = 1$ and 2.



Accuracy of significance levels: two-moment gamma vs standard chi-square approximation 
The dependence of the achieved level on the size of the studies N for our gamma and
the standard chi-square approximations can be seen graphically in Figure 3.

The type I error of the Q test of homogeneity using the standard chi-square
approximation is considerably lower than the nominal level, and hence the standard test is
very conservative. This conservativeness is a well known fact; our simulations agree with
the simulations of Sánchez-Meca & Marín-Martínez (1997), Viechtbauer (2007), and others.

Figure 3: Achieved levels of the Q test at the nominal level of 0.05 (left) and 0.1 (right)
using two approximations, as a function of the total sample size of each study N. The
upper curves are from the gamma approximation and the lower curves are from the
chi-square approximation. The null value of the SMD $\delta$ is fixed at 0.5. To better show
details, the data for N = 20 have been omitted.



Because the standard test is so conservative, there is a well known recommendation to
use the 10% significance level for the Q test (see Petitti (2001), among others). Our
simulations confirm that this recommendation is certainly justified; for 10 or more small
studies (N = 20), the type I error at the 10% significance level is closer to 5% than to
10%.

In contrast, our gamma approximation is somewhat liberal for small values of N. In 
fact, for total study sizes as small as N = 20 the gamma approximation is sufficiently 
poor that we do not recommend it. For N = 30 the true level seems to be in between the 
two approximations. Starting from N = 40 the gamma approximation works better than 
the standard chi-square approximation. For a fixed value of I, the performance of both
approximations improves with the study size, but the improvement is considerably faster 
for the gamma approximation. For N = 100 the gamma approximation delivers perfect 
results, whereas the chi-square approximation is still too conservative. 

For fixed study size N, the accuracy of the achieved levels decays as the number
of studies I increases. For example, for the gamma approximation, studies of size 40
(and even size 30) provide reasonably accurate levels when there are only I = 5 studies.
However when the number of studies increases to I = 50, then larger study sizes are
necessary to achieve accurate levels. For I = 50, studies of size 40 are not large enough,
but studies of size 100 give excellent results. For an intermediate number of I = 20
studies, the study size of N = 40 gives reasonably accurate levels, producing levels of
about 0.055 and 0.108 for nominal levels of 5% and 10% respectively. The pattern is
similar for the chi-square approximation: meta-analyses with many studies require large
sample sizes for accuracy. But in all cases, the chi-square performs less well than the
gamma approximation. The dependence of the behavior of the achieved levels on I can
be seen in Figure 4.

Figure 4: Achieved levels of the Q test at the nominal level of 0.05 (left) and 0.1 (right)
using two approximations, as a function of the number of studies I. The upper curves
are from the gamma approximation and the lower curves are from the chi-square
approximation. The null value of the SMD $\delta$ is fixed at 0.5. To better show details, the
data for N = 20 have been omitted.

The simulations show that the type I error of the standard chi-square test decreases as
the effect size $\delta$ increases. Thus the test is even more conservative for larger effect sizes.
However, the gamma approximation improves as the effect size $\delta$ increases, contrasting
with the worsening of the chi-square approximation. The dependence of the behavior of
the achieved levels on $\delta$ can be seen in Figure 5.

Figure 5: Achieved levels of the Q test at the nominal level of 0.05 (left) and 0.1 (right)
using two approximations, as a function of the standardized mean difference $\delta$. The upper
curves are from the gamma approximation and the lower curves are from the chi-square
approximation. The number of studies is fixed at I = 20. To better show details, the data
for N = 20 have been omitted.



Accuracy of significance levels for the chi-square approximation with fractional degrees of
freedom

The results of the simulations for the fractional chi-square test are not included in
the figures. As can be seen from Table 6, in every instance the fractional chi-square test
is superior to the usual chi-square test. Most important for applications is the fact that
the improvement given by the fractional chi-square is substantial for small to moderate
sample sizes, from N = 20. As examples of this improvement, consider the case of I = 20
studies and $\delta = 0.5$. The simulations indicate the following improvements in the achieved
level at the two nominal levels of 0.05 and 0.10: for N = 20 the achieved levels improve
from 0.021 to 0.046 and from 0.050 to 0.098, respectively; for N = 40 from 0.035 to 0.047
and from 0.076 to 0.096, respectively; and even for study size as large as N = 100, the
achieved levels improve from 0.044 to 0.048 and from 0.090 to 0.099, respectively.

Other results of the equal study size simulations 

First, the simulations of Table 6 were repeated with equal total study sizes as before,
but with each study having an unbalanced design with three-quarters of the study size
present in the control arm (q = 3/4). The results were so similar to those of the balanced
studies that we have included neither a table of the results analogous to Table 6 nor
graphical displays of the data.

Second, there is not much difference between the type I error with a known value of $\delta$
(denoted by $\Gamma^{\delta}_\alpha$ in Table 7) and the type I error with an estimated null value of $\delta$
(denoted by $\Gamma^{\hat\delta}_\alpha$ in Tables 6 and 7). Of course, only the latter test can be used in
practice.

Finally, the results in Table 6 used the estimated null value $\hat\delta = \sum w_i g_i / \sum w_i$. In
Table 7 the simulations were repeated using $\hat\delta = \sum N_i g_i / \sum N_i$ instead. It is known
(Yuan & Bushman, 2002) that the former, more natural, estimator is a biased estimator
of the combined null value of $\delta$. Does the choice of estimator of $\delta$ affect the results? It
can be seen that the only noticeable differences are for N = 20 and $\delta = 2$. Then the
constant weights $N_i$ provide p-values closer to those obtained using the known value of
$\delta$ when using the gamma approximation. Interestingly, the inverse variance weights provide
p-values closer to nominal for I = 10 and I = 20, but not for I = 5 or I = 50. These
differences are only academic, though: we do not recommend our gamma approximation
for N = 20 in any case, and $\delta = 2$ is much too large. Thus, there is no practical difference
between the two choices.



5.2 Simulations under the null hypothesis: unequal study sizes 

The second series of simulations used unequal study sizes. We have followed a suggestion of
Sánchez-Meca & Marín-Martínez (2000), who selected the following study sizes with the
skewness of 1.464 which they consider typical for meta-analyses in the field of behavioral
and health sciences: the set $N_1$ with average study size of 60, consisting of individual
study sizes {24, 32, 36, 40, 168}; the set $N_2$ with average study size of 100, consisting of
individual study sizes {64, 72, 76, 80, 208}; and the set $N_3$ with average study size of 160,
consisting of individual study sizes {124, 132, 136, 140, 268}. We have taken the studies to
be balanced, thus dividing each study size equally between the two study arms. The
simulations were run for I = 5, 10 and 20. For meta-analyses with I = 10 and I = 20, the
same set of sample sizes was repeated twice or four times, respectively. The data patterns
of the simulations are summarized in Table 4.
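
The averages and the common skewness of the three sets can be checked directly. The
sketch below assumes the moment (population) definition of skewness, $m_3 / m_2^{3/2}$,
which is the definition that reproduces the quoted value of 1.464; the function names
are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def moment_skewness(values):
    """Population skewness m3 / m2**1.5 based on central moments."""
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / len(values)
    m3 = sum((v - centre) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


study_size_sets = {
    "N1": [24, 32, 36, 40, 168],
    "N2": [64, 72, 76, 80, 208],
    "N3": [124, 132, 136, 140, 268],
}
summary = {name: (mean(sizes), moment_skewness(sizes))
           for name, sizes in study_size_sets.items()}
for name, (average, skew) in summary.items():
    print(name, average, round(skew, 3))
```

The second and third sets are the first set shifted by 40 and 100, so all three share
the same skewness while the average study size changes.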



I (number of studies)                              5, 10, 20
N (average and (individual) study sizes)           60 (24, 32, 36, 40, 168)
                                                   100 (64, 72, 76, 80, 208)
                                                   160 (124, 132, 136, 140, 268)
q (proportion of each study in the control arm)    1/2
$\delta$ (null value of the SMD)                   0.5

Table 4: Data pattern of the simulations used in Table 8 for the Type I error in the Q
test for unequal study sizes



Results of the simulations with unequal study sizes

The results of the simulations with unequal study sizes are given in Table 8. The
approximation of the moments is excellent. The first moments given by the formulas are
nearly exact (relative error less than 2%) and the second moments have relative error
less than 3%, compared with relative errors of the chi-square first and second moments of
more than 5% and 10% respectively.

The significance levels are similar to those obtained from the simulations for equal
study sizes. The chi-square approximation yields a conservative test, while the gamma
approximation yields a liberal test which is closer to the nominal levels. At the significance
levels of 0.05 and 0.10, the gamma approximation is nearly perfect for the larger two
sizes N = 100 and 160 while the error in the level of the chi-square approximation is
substantial even for the largest size of N = 160. For the smaller size of N = 60, the
gamma approximation has an error of roughly half that of the chi-square approximation.
Graphical displays of the levels are shown in Figure 6. Once more, the results from the
fractional chi-square approximation are nearly perfect even for the smallest sample sizes.

Figure 6: Achieved levels of the Q test at the nominal levels of 0.05 (left) and 0.1
(right) using two approximations, as a function of the average sample size of each study
N. The sample sizes are unequal in this figure. The upper curves are from the gamma
approximation and the lower curves are from the chi-square approximation. I is the
number of studies. The standardized mean difference has been fixed at $\delta = 0.5$.



5.3 Comparison of the power of the Q tests 

The standard Q test using the chi-square approximation is well known to have low power
(see for example Viechtbauer (2007)). In this section we report on simulations to see
how the power of the Q test is improved by the use of our moment approximations. To
this end, we adopt the random effects model in which the heterogeneity in effects among
the several studies is modeled by the assumption that the effect $\delta_i$ of the ith study is
normally distributed about a fixed mean $\delta$ with variance $\tau^2$. Then the null (homogeneity)
hypothesis becomes $\tau^2 = 0$ and alternatives are measured by the magnitude of $\tau^2$. In the
simulations, we have taken $\delta = 0.5$, a 'medium' effect size, and have varied $\tau^2$ from 0.025
to 0.25. We compared the power of the standard and the improved tests in the range
from N = 20 to N = 80 where we expected noticeable differences: in general, the power
of the standard Q test is considered not to be sufficient for N < 80 (Viechtbauer (2007)).
The data patterns for the power simulations are specified in Table 5.



I (number of studies)                              5, 10, 20, 50
N (equal study sizes)                              20, 30, 40, 50, 60, 80
q (proportion of each study in the control arm)    1/2
$\delta$ (null value of the SMD)                   0.5
$\tau^2$ (variance of random effect)               0.025, 0.05, 0.1, 0.15, 0.20, 0.25
$\alpha$ (nominal significance level of the test)  0.05, 0.10

Table 5: Data pattern of the simulations used in Table 9 for the power of the Q test.



We conducted 10,000 repetitions for each configuration. We simulated within-study
parameters $\delta_i \sim N(\delta, \tau^2)$, $i = 1, \dots, I$, and then simulated the values of $g_i$ directly
from the appropriately scaled non-central t-distribution with non-centrality parameter
$\sqrt{Nq(1-q)}\,\delta_i$. The results of the simulations appear in Table 9.
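
The two-stage scheme just described can be sketched as follows for the standard
chi-square test only. As before, this is an illustration under stated assumptions and
not the authors' R code: the correction J = 1 - 3/(4(N - 2) - 1), the inverse-variance
weights with variance 1/(Nq(1-q)) + g^2/(2N), the seed and the reduced number of
repetitions are all choices made here. The non-central t draw uses its definition as a
shifted normal divided by the root of an independent scaled chi-square.

```python
import math
import random


def noncentral_t(df, ncp, rng):
    """Non-central t draw: N(ncp, 1) over sqrt(chi-square(df) / df)."""
    z = rng.gauss(ncp, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


def chi2_sf_even_df(x, df):
    """Upper tail of a chi-square with even df (closed form); enough for I = 5."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def rejection_rate(n_studies, n_total, delta, tau2, alpha, reps, rng, q=0.5):
    """Share of simulated meta-analyses in which the standard Q test rejects."""
    scale = math.sqrt(n_total * q * (1.0 - q))
    df = n_total - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # assumed correction factor J
    rejections = 0
    for _rep in range(reps):
        effects = []
        for _study in range(n_studies):
            delta_i = rng.gauss(delta, math.sqrt(tau2))  # random study effect
            effects.append(correction * noncentral_t(df, scale * delta_i, rng) / scale)
        weights = [1.0 / (1.0 / scale ** 2 + g * g / (2.0 * n_total)) for g in effects]
        pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
        q_stat = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
        if chi2_sf_even_df(q_stat, n_studies - 1) < alpha:
            rejections += 1
    return rejections / reps


rng = random.Random(7)  # arbitrary seed
achieved_level = rejection_rate(5, 40, 0.5, 0.0, 0.05, 2000, rng)
power = rejection_rate(5, 40, 0.5, 0.25, 0.05, 2000, rng)
print(achieved_level, power)
```

Setting tau2 = 0 recovers the null simulations, so the same function gives both the
achieved level and the power of the standard test.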

Results of the power simulations

Since the test based on the gamma approximation is liberal, its power is higher than
the power of the conservative standard test. We note that the power of the test using the
fractional chi-square distribution is also always higher than that of the test using the
standard chi-square approximation. In this discussion, we focus on the magnitude of the
improvement in power rather than on the power for the tests separately. The most striking
result of the simulations is that the power improvement increases as the number of studies
increases and as the sizes of the studies decrease. The greatest improvement in power for
the fractional chi-square test in comparison to the standard test (based on the range of
our simulations) is 21 percentage points, which occurred for I = 50 studies, study sizes
N = 20, and $\tau^2 = 0.1$. Maximum improvements for the other values of I were 12
percentage points for I = 20, 7 percentage points for I = 10 and 4 percentage points for
I = 5, all occurring at the smallest study size of N = 20. As the study sizes N increase
from N = 20 to N = 40, the improvement in power for the fractional chi-square test
decreases by roughly two-thirds. Finally we note that the increases in power at the two
different levels of 0.05 and 0.10 were quite similar to each other.

Since the gamma approximation is recommended only for N ≥ 40, we consider this
range when comparing the power of the test based on the gamma approximation to the
standard chi-square test. The greatest improvement in power is 11 to 12 percentage
points, which occurred for the largest number of studies I = 50 and the smallest study
sizes N = 40. Maximum improvements for the other values of I were 7 percentage points
for I = 20, 5 percentage points for I = 10 and 3 percentage points for I = 5. As the
study sizes N increased from N = 40 to N = 80, the improvement decreased by roughly
half. Once more, the increases in power at the two different levels of 0.05 and 0.10 were
quite similar to each other.

6 Summary and concluding discussion 

The main focus of this paper is the improvement of the test for homogeneity commonly 
used in meta-analysis by referring Cochran's Q statistic to a more accurate distribution. 
In this paper, we have considered the situation in which the Q statistic is a function of 
only one parameter and have applied the results to the case in which the effect of interest 
is measured by the standardized mean difference (SMD or Cohen's d statistic), a measure 
which is frequently used in meta-analytic applications. We have presented expansions 
for the first two moments of Q which are accurate to order O(1/n). These expansions
thus offer corrections of order O(1/n) to the corresponding moments of the chi-square
approximation to the distribution of Q. These expansions are the first that we are aware 
of to include the situation in which the weights in the Q statistic are not independent of 
the effects (as is the case with the SMD). 

We considered two options to approximate the distribution of Q for the SMD: a
gamma distribution with moments matching those of the expansions, or a chi-square
distribution with fractional degrees of freedom matching the first moment. Both
approximations result in improved Q tests for homogeneity when the effects are measured
by the SMD. To facilitate the substantial computations necessary for these improved tests,
a computer program in the R language can be downloaded from
http://www.imperial.ac.uk/stathelp/researchprojects/metaanalysis.

Our simulations show that the improved test for the SMD using the gamma distribu- 
tion is somewhat liberal (rejecting the null hypothesis more often than appropriate); in 
contrast, the currently used test which uses the chi-square distribution is well known to be 
conservative. But the improved test based on gamma approximation is quite accurate for 
study sizes of 40 or more (for example, 20 subjects in each arm of a randomized clinical 
trial). 

However our recommended test is based not on the gamma approximation but on 
the use of the fractional chi-square distribution whose first moment matches that of the 
expansion. In applications, the parameter in the expansion will need to be estimated 
from the data. Thus our recommended approximating distribution of Q (namely $\chi^2_{E(Q)}$) is
data dependent as opposed to the now standard approximating distribution of Q (namely
$\chi^2_{I-1}$) which is data independent. The result is an improved Q test for homogeneity when
the effects are measured by the SMD. 

Simulations show that our recommended improved Q test for homogeneity yields a 
substantial improvement over the standard test in accuracy of achieved significance levels, 
especially for small to moderate study sizes. In addition the improved test provides an
increase in power. The simulations show that the improved test works quite well in
a variety of circumstances, such as when the individual studies have unbalanced sizes 
between the two arms or when the studies have substantially different total sizes from 
each other. 

An important limitation of this paper, which is intended to be the first in a series, is 
the restriction to the one parameter case. In future work, we plan to extend our expan- 
sions to the two parameter case and to provide applications to important meta-analytic 
measures such as the risk difference and the odds ratio.

A Appendix

Equation 3, which approximates the second moment of Q, needs expressions for various
derivatives of $Q^2$ with respect to $\theta$. These derivatives are provided below. But for ease
of reference, we first reproduce Equation 3.



+ IsEE^ mm + sSEEa||pi| E[efiE[e»iE[eg 

Here are the derivatives of Q 2 needed for the above formula. 



d A [Q 2 } 

def 

d*[Q 2 } 
defde] 
d 5 [Q 2 } 

dOf 
d 5 [Q 2 } 

defde 2 

d 6 [Q 2 } 
d9f 

d e [Q 2 } 

defde 2 



d & [Q 2 



oe 2 de 2 de 2 



24w 2 U 2 



WiWj UiUj + 



2wiWj 
W 2 



2AUiWj 



720U? 
48 



w 2 r 



^-48- 
dBi W 



UU- + ^ 
UlU i + w 2 . 



dfc 

do. 



Ui 



2w { \ ( dfi 



W J \ dJ9 4 



Wi 



de 2 



w 4 



-TWUiWjiW 2 ^ - AWwj + 9wiWj 



+W 2 U i w j (W 2 - Wwj - Wwi + QwiW 
-8WUiWi(W 2 Ui - Wwj + 3wiWj 



d 2 fi 



" de 2 

dfi\ ( dfj ' 
dej \dO jt 



+ wf{-2W + 3 Wi ) ( ) + W 2 U lW sd2fj 



df 3 



de, 



3 

-^^{Ww 3 + Ww k - 9w jWk ) I — J 



(20) 
(21) 
(22) 
(23) 

(24) 
(25) 



(26) 






+ 8 ^?(W<"j + W<" l -6^) d2/i 



A; 



+ 8 ^(Ww i + Ww k - 6w lWk )^ 



3 

8wiWi , , d 2 f k 



'k 

32w k 



W 2 
32wj 



HI '111 ■ 

UiUj(W - Qw k ) + -^J-(W- 12w k ) + 3w k 
U t U k (W - 6^) + - 12 Wj ) + 3wj 



W 2 L J W 2 

32u>,- 



W 2 



£/,-£4(W - 6^) + ^ - 12^) + 3^ 



28 



References 

Bain, L. (1969). Moments of a noncentral t and noncentral F-Distribution. The American 
Statistician 23, 33-34. 

Biggerstaff, B. and Jackson, D. (2008). The exact distribution of Cochrans heterogeneity- 
statistic in one-way random effects meta-analysis. Statistics in Medicine 27, 6093-6110. 

Biggerstaff, B. and Tweedie, R. (1997). Incorporating variability in estimates of het- 
erogeneity in the random effects model in meta-analysis. Statistics in Medicine 16, 
753-768. 

Cochran, W. (1937). Problems arising in the analysis of a series of similar experiments. 
JRSS4, 102-118. 

Cohen, J. 1988, Statistical power analysis for the behavioral sciences (2nd ed.) (Hillsdale, 
NJ: Lawrence Earlbaum Associates). 

Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika 80, 27-8. 

Hedges, L. and Olkin, I. 1985, Statistical methods for meta-analysis (Orlando: Academic 
Press). 

Hrobjartsson, A. and G0tzsche, P. (2004). Placebo interventions for all clinical con- 
ditions. Cochrane Database of Systematic Reviews Art. No.: CD003974. DOI: 
10.1002/14651858. CD003974.pub2. 

Jackson, D. (2006). The power of the standard test for the presence of heterogeneity in 
meta-analysis. Statistics in Medicine 25, 2688-2699. 

James, G. (1951). The Comparison of Several Groups of Observations when the Ratios 
of the Population Variances are unknown. Biometrika 38, 324-329. 

Johnson, N., Kotz, S., and Balakrishnan, N. 1995, Continuous Univariate Distributions, 
Vol. 2 (New York: John Wiley & Sons). 

Kulinskaya, E. and Dollinger, M. (2007). Robust weighted one-way ANOVA: Improved 
approximation and efficiency. Journal of Statistical Planning and Inference 137, 462- 
472. 

Kulinskaya, E., Dollinger, M., Knight, E., and Gao, H. (2004). A Welch-type test for 
homogeneity of contrasts under heteroscedasticity with application to meta-analysis. 
Statistics in Medicine 23, 3655-3670. 



29 



Kulinskaya, E., Staudte, R., and Gao, H. (2003). Power approximations in testing for 
unequal means in a one-way ANOVA weighted for unequal variances. Communications 
in Statistics — Theory and Methods 32, 2353-2371. 

Normand, S.-L. (1999). Meta-analysis: Formulating, evaluating, combining, and report- 
ing. Statistics in Medicine 18, 321-359. 

Petitti, D. (2001). Approaches to heterogeneity in meta-analysis. Statistics in Medicine 
20, 3625-3633. 

R Development Core Team. 2004, R: A language and environment for statistical comput- 
ing, R Foundation for Statistical Computing, Vienna, Austria, iSBN 3-900051-00-3. 

Sanchez-Meca, J. and Maryn-Martynez, F. (1997). Homogeneity tests in meta-analysis: 
A Monte Carlo comparison of statistical power and Type I error. Quality & Quantity 
31, 385-399. 

Sanchez-Meca, J. and Maryn-Martynez, F. (2000). Testing the significance of a common 
risk difference in meta-analysis. Computational Statistics & Data Analysis 33, 299-313. 

Tuunainen, A., Kripke, D., and Endo, T. (2004). Light therapy for non-seasonal 
depression. Cochrane Database of Systematic Reviews Art. No.: CD004050. DOI: 
10.1002/14651858. CD004050.pub2. 

Viechtbauer, W. (2007). Hypothesis tests for population heterogeneity in meta-analysis. 
British Journal of Mathematical and Statistical Psychology 60, 29-60. 

Welch, B. (1951). On the comparison of several mean values: an alternative approach. 
Biometrika 38, 330-336. 

Yuan, K.-H. and Bushman, B. (2002). Combining standardized mean differences using 
the method of maximum likelihood. Psychometrika 67, 589-608. 



30 



I   N    δ    χ²_.05  Γ_.05  χ²_E(Q),.05  χ²_.1  Γ_.1   χ²_E(Q),.1  E_f(Q)  \bar{Q}  E_f(Q²)  \bar{Q^2}
5   10   0    0.014   NA     0.052        0.041  NA     0.119       2.8     3.2      -12.2    14.9
5   10   0.2  0.014   NA     0.051        0.039  NA     0.119       2.7     3.2      -14.3    14.8
5   10   0.5  0.013   NA     0.052        0.037  NA     0.120       2.6     3.1      -22.7    14.5
5   10   1    0.012   NA     0.055        0.035  NA     0.127       2.5     3.1      -24.4    14.1
5   10   2    0.008   NA     0.039        0.027  NA     0.097       2.8     3.0      134.3    12.8
5   14   0    0.026   NA     0.044        0.061  NA     0.097       3.4     3.5      12.7     18.1
5   14   0.2  0.026   NA     0.044        0.062  NA     0.097       3.4     3.5      12.4     18.0
5   14   0.5  0.025   NA     0.045        0.061  NA     0.098       3.3     3.5      10.9     17.9
5   14   1    0.023   NA     0.043        0.056  NA     0.097       3.2     3.4      8.9      17.3
5   14   2    0.019   NA     0.040        0.049  NA     0.093       3.2     3.3      20.0     16.3
5   16   0    0.030   0.105  0.045        0.068  0.161  0.096       3.5     3.6      15.7     19.1
5   16   0.2  0.030   0.110  0.044        0.068  0.164  0.097       3.5     3.6      15.4     19.0
5   16   0.5  0.029   0.124  0.044        0.066  0.178  0.095       3.5     3.6      14.5     18.8
5   16   1    0.028   0.160  0.044        0.063  0.212  0.097       3.4     3.5      13.0     18.3
5   16   2    0.022   0.061  0.040        0.055  0.112  0.091       3.3     3.4      17.8     17.2
5   20   0    0.034   0.069  0.045        0.075  0.123  0.096       3.7     3.7      18.5     20.3
5   20   0.2  0.034   0.070  0.045        0.075  0.123  0.096       3.6     3.7      18.4     20.2
5   20   0.5  0.034   0.073  0.046        0.074  0.127  0.095       3.6     3.7      17.9     20.1
5   20   1    0.032   0.081  0.045        0.072  0.134  0.095       3.6     3.6      16.9     19.7
5   20   2    0.028   0.058  0.042        0.064  0.110  0.092       3.5     3.6      18.1     18.7
5   30   0    0.040   0.055  0.047        0.084  0.106  0.096       3.8     3.8      21.1     21.6
5   30   0.2  0.039   0.054  0.046        0.084  0.106  0.096       3.8     3.8      21.0     21.5
5   30   0.5  0.040   0.056  0.047        0.084  0.108  0.096       3.8     3.8      20.8     21.5
5   30   1    0.038   0.058  0.046        0.082  0.109  0.096       3.8     3.8      20.3     21.2
5   30   2    0.036   0.054  0.045        0.078  0.105  0.095       3.7     3.7      20.1     20.7
5   40   0    0.042   0.051  0.046        0.088  0.102  0.096       3.9     3.8      22.0     22.0
5   40   0.2  0.044   0.053  0.048        0.090  0.104  0.098       3.9     3.9      22.0     22.4
5   40   0.5  0.043   0.054  0.048        0.088  0.103  0.097       3.8     3.9      21.9     22.3
5   40   1    0.041   0.054  0.047        0.087  0.104  0.096       3.8     3.8      21.5     21.9
5   40   2    0.039   0.052  0.046        0.082  0.102  0.094       3.8     3.8      21.2     21.4
5   100  0    0.048   0.051  0.050        0.097  0.101  0.100       3.9     4.0      23.3     23.4
5   100  0.2  0.048   0.051  0.049        0.096  0.100  0.099       3.9     3.9      23.3     23.4
5   100  0.5  0.046   0.050  0.048        0.095  0.100  0.098       3.9     3.9      23.3     23.3
5   100  1    0.046   0.050  0.048        0.095  0.100  0.098       3.9     3.9      23.2     23.1
5   100  2    0.045   0.050  0.048        0.093  0.100  0.098       3.9     3.9      23.0     23.0
5   200  0    0.049   0.051  0.050        0.098  0.101  0.100       4.0     4.0      23.7     23.8
5   200  0.2  0.049   0.050  0.050        0.097  0.100  0.099       4.0     4.0      23.7     23.6
5   200  0.5  0.048   0.049  0.049        0.097  0.099  0.099       4.0     4.0      23.7     23.6
5   200  1    0.049   0.050  0.050        0.097  0.100  0.099       4.0     4.0      23.6     23.7
5   200  2    0.048   0.050  0.049        0.098  0.102  0.101       4.0     4.0      23.5     23.7

10  10   0    0.007   NA     0.068        0.022  NA     0.149       5.8     7.1      -29.7    60.2
10  10   0.2  0.007   NA     0.070        0.022  NA     0.152       5.8     7.1      -35.7    60.4
10  10   0.5  0.007   NA     0.072        0.022  NA     0.157       5.6     7.0      -58.3    59.4
10  10   1    0.006   NA     0.073        0.019  NA     0.162       5.4     6.9      -46.2    57.2
10  10   2    0.004   NA     0.034        0.013  NA     0.085       6.9     6.6      523.3    52.9
10  14   0    0.019   NA     0.047        0.047  NA     0.101       7.4     7.8      54.2     73.6
10  14   0.2  0.019   NA     0.046        0.046  NA     0.102       7.4     7.8      53.2     73.7
10  14   0.5  0.018   NA     0.047        0.045  NA     0.103       7.3     7.8      48.9     72.9
10  14   1    0.016   NA     0.047        0.041  NA     0.104       7.2     7.7      44.1     71.2
10  14   2    0.011   NA     0.037        0.032  NA     0.085       7.3     7.4      88.7     66.5
10  16   0    0.022   0.156  0.044        0.053  0.214  0.096       7.8     8.0      65.1     77.3
10  16   0.2  0.022   0.167  0.045        0.053  0.224  0.099       7.7     8.0      64.5     77.5
10  16   0.5  0.022   0.213  0.045        0.053  0.263  0.098       7.7     7.9      61.7     76.6
10  16   1    0.020   0.310  0.044        0.048  0.344  0.097       7.5     7.9      57.6     74.8
10  16   2    0.016   0.043  0.039        0.041  0.088  0.089       7.5     7.7      78.1     71.0
10  20   0    0.028   0.082  0.046        0.064  0.137  0.095       8.1     8.2      76.3     82.6
10  20   0.2  0.028   0.083  0.046        0.064  0.139  0.095       8.1     8.2      75.9     82.3
10  20   0.5  0.028   0.091  0.046        0.064  0.150  0.098       8.1     8.2      74.3     82.1
10  20   1    0.026   0.101  0.045        0.060  0.158  0.095       8.0     8.1      71.4     80.2
10  20   2    0.022   0.052  0.041        0.052  0.104  0.090       7.9     8.0      77.0     77.3
10  30   0    0.036   0.058  0.046        0.076  0.109  0.095       8.5     8.5      86.5     88.6
10  30   0.2  0.036   0.058  0.047        0.077  0.111  0.096       8.5     8.5      86.4     88.3
10  30   0.5  0.035   0.059  0.046        0.077  0.113  0.096       8.5     8.5      85.6     88.3
10  30   1    0.034   0.062  0.046        0.074  0.115  0.096       8.4     8.5      84.0     87.1
10  30   2    0.030   0.052  0.043        0.068  0.103  0.092       8.3     8.3      83.6     84.5
10  40   0    0.039   0.053  0.046        0.082  0.104  0.095       8.6     8.6      90.4     90.9
10  40   0.2  0.040   0.054  0.047        0.084  0.106  0.098       8.6     8.7      90.3     91.5
10  40   0.5  0.041   0.056  0.049        0.084  0.108  0.099       8.6     8.7      89.9     91.9
10  40   1    0.037   0.054  0.046        0.080  0.107  0.096       8.6     8.6      88.7     90.2
10  40   2    0.035   0.052  0.045        0.076  0.102  0.094       8.5     8.5      87.7     88.3
10  100  0    0.045   0.050  0.048        0.094  0.101  0.099       8.9     8.9      96.0     96.1
10  100  0.2  0.045   0.050  0.048        0.091  0.098  0.096       8.9     8.9      96.0     95.8
10  100  0.5  0.046   0.051  0.049        0.094  0.101  0.099       8.9     8.9      95.8     96.2
10  100  1    0.046   0.051  0.049        0.093  0.102  0.099       8.8     8.8      95.5     95.7
10  100  2    0.043   0.050  0.048        0.090  0.100  0.098       8.8     8.8      94.8     95.0
10  200  0    0.047   0.049  0.048        0.095  0.099  0.098       8.9     8.9      97.6     96.9
10  200  0.2  0.049   0.051  0.050        0.098  0.101  0.100       8.9     8.9      97.5     97.5
10  200  0.5  0.048   0.051  0.050        0.098  0.101  0.100       8.9     8.9      97.5     97.7
10  200  1    0.048   0.050  0.049        0.097  0.101  0.100       8.9     8.9      97.3     97.6
10  200  2    0.045   0.049  0.047        0.093  0.098  0.097       8.9     8.9      96.9     96.4

20  10   0    0.003   NA     0.103        0.011  NA     0.206       12.0    14.8     -34.5    241.8
20  10   0.2  0.003   NA     0.102        0.011  NA     0.205       11.9    14.8     -49.7    240.3
20  10   0.5  0.003   NA     0.109        0.010  NA     0.219       11.5    14.7     -105.9   237.6
20  10   1    0.002   NA     0.116        0.008  NA     0.231       11.3    14.5     -59.6    231.0
20  10   2    0.001   NA     0.031        0.005  NA     0.076       15.2    14.0     1481.0   214.1
20  14   0    0.013   NA     0.052        0.033  NA     0.110       15.5    16.4     229.0    294.8
20  14   0.2  0.012   NA     0.051        0.032  NA     0.111       15.5    16.3     226.1    294.1
20  14   0.5  0.012   NA     0.053        0.031  NA     0.112       15.3    16.3     213.7    292.4
20  14   1    0.010   NA     0.053        0.027  NA     0.115       15.0    16.1     200.3    285.7
20  14   2    0.006   NA     0.037        0.020  NA     0.087       15.5    15.7     328.6    269.9
20  16   0    0.016   NA     0.048        0.040  NA     0.103       16.2    16.8     267.9    311.0
20  16   0.2  0.017   NA     0.048        0.040  NA     0.103       16.2    16.8     266.0    310.6
20  16   0.5  0.015   NA     0.048        0.038  NA     0.105       16.1    16.7     257.5    307.4
20  16   1    0.014   NA     0.049        0.035  NA     0.106       15.8    16.6     245.1    302.6
20  16   2    0.010   0.037  0.040        0.027  0.081  0.089       15.9    16.2     304.3    287.1
20  20   0    0.022   0.101  0.046        0.052  0.162  0.098       17.0    17.3     308.9    330.7
20  20   0.2  0.022   0.102  0.046        0.051  0.162  0.097       17.0    17.3     307.8    329.2
20  20   0.5  0.021   0.115  0.046        0.050  0.176  0.098       16.9    17.3     302.8    328.4
20  20   1    0.020   0.135  0.047        0.048  0.196  0.099       16.7    17.1     293.2    323.5
20  20   2    0.016   0.053  0.042        0.040  0.107  0.091       16.6    16.8     308.9    312.0
20  30   0    0.031   0.062  0.046        0.069  0.115  0.097       17.9    17.9     348.4    355.0
20  30   0.2  0.032   0.063  0.047        0.069  0.116  0.097       17.9    17.9     347.9    355.9
20  30   0.5  0.030   0.063  0.046        0.066  0.116  0.095       17.8    17.9     345.5    353.3
20  30   1    0.030   0.068  0.048        0.066  0.124  0.098       17.7    17.8     339.8    351.4
20  30   2    0.025   0.054  0.043        0.057  0.106  0.091       17.5    17.6     337.4    340.8
20  40   0    0.036   0.056  0.047        0.077  0.108  0.098       18.2    18.2     363.8    367.3
20  40   0.2  0.036   0.056  0.048        0.077  0.106  0.096       18.2    18.2     363.5    367.3
20  40   0.5  0.035   0.056  0.047        0.076  0.108  0.096       18.2    18.2     361.9    364.8
20  40   1    0.034   0.058  0.048        0.074  0.111  0.098       18.1    18.2     357.9    364.0
20  40   2    0.031   0.053  0.045        0.068  0.104  0.094       17.9    18.0     353.7    356.3
20  100  0    0.045   0.051  0.049        0.092  0.103  0.100       18.7    18.7     386.5    388.1
20  100  0.2  0.044   0.050  0.048        0.091  0.102  0.099       18.7    18.7     386.4    387.0
20  100  0.5  0.044   0.051  0.048        0.090  0.101  0.098       18.7    18.7     386.0    385.6
20  100  1    0.044   0.051  0.049        0.089  0.101  0.097       18.7    18.7     384.5    384.8
20  100  2    0.042   0.051  0.048        0.088  0.101  0.098       18.6    18.6     381.9    382.5
20  200  0    0.048   0.051  0.050        0.096  0.101  0.100       18.9    18.9     393.0    394.3
20  200  0.2  0.047   0.050  0.049        0.097  0.102  0.101       18.9    18.9     393.0    393.6
20  200  0.5  0.046   0.050  0.049        0.096  0.101  0.100       18.9    18.9     392.7    393.0
20  200  1    0.047   0.051  0.050        0.095  0.100  0.099       18.8    18.9     392.1    392.8
20  200  2    0.046   0.050  0.049        0.094  0.100  0.099       18.8    18.8     390.6    391.6

50  20   0    0.014   0.169  0.049        0.034  0.231  0.103       43.8    44.6     1945.7   2064.6
50  20   0.2  0.014   0.173  0.050        0.034  0.235  0.102       43.7    44.5     1940.1   2059.9
50  20   0.5  0.013   0.210  0.051        0.034  0.269  0.107       43.5    44.5     1914.7   2057.5
50  20   1    0.011   0.285  0.050        0.030  0.333  0.107       43.0    44.1     1862.5   2022.4
50  20   2    0.008   0.082  0.042        0.022  0.140  0.091       42.9    43.3     1907.4   1950.6
50  30   0    0.023   0.070  0.046        0.053  0.127  0.096       46.0    46.2     2182.5   2218.9
50  30   0.2  0.024   0.073  0.048        0.055  0.130  0.099       45.9    46.3     2179.8   2226.4
50  30   0.5  0.023   0.076  0.048        0.054  0.133  0.099       45.8    46.1     2166.7   2215.6
50  30   1    0.022   0.084  0.049        0.051  0.141  0.100       45.5    46.0     2134.5   2197.4
50  30   2    0.017   0.065  0.043        0.042  0.119  0.092       45.2    45.4     2110.0   2137.9
50  40   0    0.030   0.061  0.048        0.066  0.115  0.099       46.9    47.0     2277.4   2298.3
50  40   0.2  0.030   0.060  0.047        0.066  0.114  0.098       46.9    47.0     2275.6   2295.2
50  40   0.5  0.029   0.060  0.046        0.063  0.114  0.097       46.8    46.9     2266.9   2290.6
50  40   1    0.029   0.065  0.049        0.063  0.120  0.099       46.6    46.8     2243.9   2277.0
50  40   2    0.024   0.058  0.045        0.055  0.110  0.093       46.2    46.4     2214.5   2234.3
50  100  0    0.041   0.051  0.049        0.085  0.101  0.098       48.2    48.2     2419.8   2420.3
50  100  0.2  0.041   0.051  0.049        0.086  0.103  0.100       48.2    48.2     2419.2   2416.6
50  100  0.5  0.041   0.052  0.049        0.086  0.103  0.100       48.2    48.3     2416.3   2426.5
50  100  1    0.040   0.051  0.048        0.083  0.101  0.097       48.1    48.1     2408.0   2407.3
50  100  2    0.038   0.051  0.047        0.081  0.102  0.098       48.0    48.0     2391.7   2395.3
50  200  0    0.046   0.050  0.049        0.093  0.100  0.099       48.6    48.6     2460.7   2461.6
50  200  0.2  0.046   0.051  0.050        0.094  0.102  0.100       48.6    48.6     2460.5   2459.7
50  200  0.5  0.046   0.051  0.050        0.093  0.100  0.099       48.6    48.6     2459.1   2459.3
50  200  1    0.046   0.051  0.050        0.092  0.101  0.099       48.6    48.6     2455.1   2456.8
50  200  2    0.044   0.051  0.049        0.091  0.101  0.099       48.5    48.5     2446.5   2450.8

Table 6: Type I error of the standard Q test and the improved Q 
test for homogeneity (gamma and χ²_E(Q) approximations) under 
the null, and moments of the distribution of Q. Sample sizes are 
equal and balanced. The column headings are defined in Section 5.1. 
Here $\delta = \sum_i w_i \delta_i / W$. 
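
To make concrete how entries of this kind can be produced, the following Python sketch estimates the Type I error of the standard Q test at the 5% level by Monte Carlo simulation for I = 5 studies with equal, balanced arms. It is a rough illustration under stated assumptions, not the simulation design of this paper: N is taken to be the total size of each study split equally between two arms, the data are normal with unit variance, the effect is the uncorrected Cohen's d, and its variance is estimated by the common large-sample formula (n1 + n2)/(n1 n2) + d²/(2(n1 + n2)). All function names are hypothetical.

```python
import random

# Upper 5% point of the chi-square distribution with 4 degrees of freedom,
# i.e. the standard critical value for I = 5 studies.
CHI2_CRIT_DF4_05 = 9.487729
N_STUDIES = 5


def mean(xs):
    return sum(xs) / len(xs)


def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cohen_d(treat, ctrl):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treat), len(ctrl)
    pooled = ((n1 - 1) * sample_var(treat) +
              (n2 - 1) * sample_var(ctrl)) / (n1 + n2 - 2)
    return (mean(treat) - mean(ctrl)) / pooled ** 0.5


def cochran_q(effects, variances):
    """Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def simulate_type1(n_total=20, delta=0.0, reps=2000, seed=1):
    """Fraction of null replications in which the standard test rejects."""
    rng = random.Random(seed)
    n_arm = n_total // 2
    rejections = 0
    for _ in range(reps):
        effects, variances = [], []
        for _ in range(N_STUDIES):
            treat = [rng.gauss(delta, 1.0) for _ in range(n_arm)]
            ctrl = [rng.gauss(0.0, 1.0) for _ in range(n_arm)]
            d = cohen_d(treat, ctrl)
            # Assumed large-sample variance estimate of d for equal arms.
            v = 2.0 / n_arm + d * d / (4.0 * n_arm)
            effects.append(d)
            variances.append(v)
        if cochran_q(effects, variances) > CHI2_CRIT_DF4_05:
            rejections += 1
    return rejections / reps


rate = simulate_type1(n_total=20, delta=0.0)
print(f"estimated Type I error of the standard test: {rate:.3f}")
```

With a few thousand replications the Monte Carlo standard error of an estimated level near 0.05 is roughly 0.005, so far more replications would be needed to resolve third-decimal differences such as those reported in the table.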
I   N    δ    χ²_.05  Γ^th_.05  Γ_.05  χ²_.1  Γ^th_.1  Γ_.1   E_f(Q)  \bar{Q}  E_f(Q²)  \bar{Q^2}  Var_f(Q)  s²(Q)
5   20   0    0.035   0.068     0.070  0.077  0.120    0.122  3.7     3.7      18.5     20.1       5.2       6.6
5   20   0.2  0.034   0.068     0.069  0.074  0.120    0.122  3.6     3.7      18.4     20.0       5.1       6.6
5   20   0.5  0.034   0.074     0.075  0.075  0.128    0.129  3.6     3.7      17.9     20.1       4.8       6.6
5   20   1    0.032   0.083     0.082  0.071  0.137    0.136  3.6     3.6      16.9     19.6       4.2       6.4
5   20   2    0.028   0.054     0.050  0.066  0.108    0.104  3.5     3.6      18.1     18.9       5.8       6.1
5   30   0    0.041   0.055     0.056  0.086  0.107    0.107  3.8     3.8      21.1     21.8       6.7       7.2
5   30   0.2  0.041   0.056     0.056  0.086  0.107    0.107  3.8     3.8      21.0     21.7       6.6       7.3
5   30   0.5  0.039   0.056     0.056  0.084  0.107    0.107  3.8     3.8      20.8     21.5       6.5       7.1
5   30   1    0.039   0.058     0.059  0.083  0.111    0.111  3.8     3.8      20.3     21.3       6.2       7.1
5   30   2    0.036   0.054     0.053  0.078  0.106    0.106  3.7     3.7      20.1     20.7       6.4       6.8
5   40   0    0.043   0.053     0.053  0.089  0.103    0.104  3.9     3.9      22.0     22.3       7.1       7.4
5   40   0.2  0.043   0.053     0.053  0.089  0.103    0.103  3.9     3.9      22.0     22.4       7.1       7.4
5   40   0.5  0.043   0.054     0.054  0.089  0.103    0.103  3.9     3.9      21.9     22.3       7.0       7.4
5   40   1    0.042   0.054     0.054  0.089  0.106    0.105  3.8     3.8      21.5     22.1       6.9       7.3
5   40   2    0.039   0.052     0.049  0.084  0.104    0.102  3.8     3.8      21.2     21.5       6.9       7.0
5   100  0    0.047   0.050     0.050  0.095  0.100    0.100  4.0     4.0      23.3     23.3       7.7       7.8
5   100  0.2  0.047   0.050     0.050  0.096  0.100    0.100  4.0     4.0      23.3     23.4       7.7       7.8
5   100  0.5  0.048   0.052     0.052  0.097  0.102    0.102  4.0     4.0      23.3     23.5       7.7       7.9
5   100  1    0.047   0.050     0.050  0.096  0.102    0.102  3.9     4.0      23.2     23.4       7.7       7.8
5   100  2    0.045   0.049     0.049  0.093  0.100    0.100  3.9     3.9      23.0     23.0       7.6       7.6
5   200  0    0.048   0.050     0.050  0.097  0.099    0.099  4.0     4.0      23.7     23.5       7.9       7.8
5   200  0.2  0.049   0.050     0.050  0.098  0.100    0.100  4.0     4.0      23.7     23.8       7.9       7.9
5   200  0.5  0.049   0.050     0.050  0.099  0.101    0.101  4.0     4.0      23.7     23.7       7.9       7.9
5   200  1    0.049   0.050     0.050  0.098  0.100    0.100  4.0     4.0      23.6     23.7       7.9       7.9
5   200  2    0.048   0.050     0.050  0.097  0.101    0.101  4.0     4.0      23.5     23.4       7.8       7.8

10  20   0    0.030   0.082     0.083  0.065  0.137    0.139  8.1     8.2      76.3     82.7       10.3      14.9
10  20   0.2  0.027   0.081     0.082  0.063  0.138    0.139  8.1     8.2      75.9     82.1       10.1      14.7
10  20   0.5  0.028   0.090     0.091  0.061  0.147    0.148  8.1     8.2      74.3     81.7       9.2       14.5
10  20   1    0.025   0.104     0.102  0.060  0.163    0.161  8.0     8.1      71.4     80.4       8.0       14.3
10  20   2    0.022   0.043     0.039  0.051  0.093    0.089  7.9     8.0      77.0     77.0       15.1      13.5
10  30   0    0.037   0.059     0.059  0.078  0.111    0.111  8.5     8.5      86.5     88.8       14.4      16.1
10  30   0.2  0.036   0.059     0.059  0.078  0.111    0.111  8.5     8.5      86.4     88.6       14.4      16.0
10  30   0.5  0.035   0.059     0.060  0.077  0.113    0.113  8.5     8.5      85.6     88.1       14.1      15.9
10  30   1    0.034   0.062     0.062  0.073  0.114    0.114  8.4     8.4      84.0     86.8       13.5      15.6
10  30   2    0.031   0.053     0.053  0.070  0.106    0.105  8.3     8.4      83.6     85.0       14.7      15.2
10  40   0    0.040   0.054     0.054  0.084  0.105    0.106  8.6     8.7      90.4     91.6       15.7      16.6
10  40   0.2  0.040   0.054     0.055  0.084  0.105    0.106  8.6     8.7      90.3     91.5       15.7      16.7
10  40   0.5  0.040   0.055     0.055  0.082  0.104    0.105  8.6     8.6      89.9     91.1       15.5      16.5
10  40   1    0.038   0.055     0.054  0.080  0.107    0.105  8.6     8.6      88.7     90.3       15.2      16.2
10  40   2    0.034   0.050     0.048  0.075  0.101    0.099  8.5     8.5      87.7     88.1       15.5      15.7
10  100  0    0.047   0.051     0.051  0.095  0.103    0.103  8.9     8.9      96.0     96.5       17.3      17.7
10  100  0.2  0.047   0.051     0.051  0.094  0.101    0.101  8.9     8.9      96.0     96.4       17.3      17.5
10  100  0.5  0.046   0.051     0.051  0.093  0.100    0.100  8.9     8.9      95.8     96.0       17.3      17.4
10  100  1    0.044   0.050     0.050  0.091  0.099    0.099  8.9     8.8      95.5     95.1       17.2      17.2
10  100  2    0.044   0.050     0.050  0.090  0.099    0.099  8.8     8.8      94.8     94.5       17.1      17.1
10  200  0    0.047   0.050     0.050  0.097  0.101    0.101  8.9     8.9      97.6     97.4       17.7      17.6
10  200  0.2  0.049   0.051     0.051  0.098  0.101    0.101  8.9     9.0      97.6     98.0       17.7      17.8
10  200  0.5  0.048   0.050     0.050  0.097  0.100    0.100  8.9     8.9      97.5     97.6       17.7      17.7
10  200  1    0.047   0.050     0.050  0.097  0.101    0.101  8.9     9.0      97.3     97.8       17.6      17.7
10  200  2    0.046   0.049     0.049  0.094  0.099    0.099  8.9     8.9      96.9     96.9       17.6      17.4

20  20   0    0.022   0.100     0.101  0.053  0.160    0.161  17.0    17.3     308.9    330.5      18.6      30.9
20  20   0.2  0.022   0.102     0.103  0.052  0.161    0.162  17.0    17.3     307.8    329.0      18.2      31.0
20  20   0.5  0.022   0.117     0.118  0.050  0.179    0.179  16.9    17.3     302.8    328.6      16.0      30.8
20  20   1    0.019   0.138     0.136  0.047  0.201    0.199  16.7    17.1     293.2    323.1      13.2      30.0
20  20   2    0.015   0.042     0.038  0.039  0.092    0.087  16.6    16.8     308.9    310.7      32.5      28.2
20  30   0    0.032   0.063     0.063  0.070  0.117    0.117  17.9    18.0     348.4    357.5      29.3      33.8
20  30   0.2  0.032   0.063     0.063  0.069  0.117    0.118  17.9    18.0     347.9    356.4      29.2      33.8
20  30   0.5  0.031   0.063     0.064  0.067  0.119    0.119  17.8    17.9     345.5    354.2      28.4      33.4
20  30   1    0.029   0.066     0.066  0.064  0.121    0.121  17.7    17.8     339.8    349.0      27.1      32.7
20  30   2    0.025   0.053     0.052  0.058  0.105    0.104  17.5    17.6     337.4    340.8      30.7      31.6
20  40   0    0.035   0.054     0.055  0.076  0.107    0.108  18.2    18.2     363.8    366.8      32.6      34.7
20  40   0.2  0.035   0.054     0.055  0.076  0.107    0.107  18.2    18.2     363.5    366.8      32.5      34.5
20  40   0.5  0.035   0.056     0.056  0.076  0.109    0.109  18.2    18.2     361.9    365.7      32.1      34.7
20  40   1    0.035   0.058     0.056  0.073  0.110    0.109  18.1    18.2     357.9    364.0      31.3      34.3
20  40   2    0.030   0.052     0.050  0.068  0.102    0.100  17.9    17.9     353.7    355.1      32.4      33.1
20  100  0    0.044   0.051     0.051  0.091  0.101    0.101  18.7    18.7     386.5    386.7      36.4      36.9
20  100  0.2  0.045   0.052     0.052  0.092  0.102    0.102  18.7    18.7     386.4    387.5      36.4      37.0
20  100  0.5  0.044   0.051     0.051  0.091  0.102    0.102  18.7    18.7     386.0    386.7      36.3      36.7
20  100  1    0.044   0.051     0.051  0.089  0.101    0.101  18.7    18.7     384.5    385.2      36.1      36.6
20  100  2    0.042   0.049     0.049  0.086  0.100    0.100  18.6    18.6     381.9    382.4      36.0      36.0
20  200  0    0.047   0.050     0.050  0.095  0.100    0.100  18.9    18.8     393.0    391.9      37.3      37.5
20  200  0.2  0.046   0.049     0.049  0.094  0.099    0.099  18.9    18.9     393.0    392.9      37.3      37.1
20  200  0.5  0.047   0.050     0.050  0.096  0.101    0.101  18.9    18.9     392.7    394.7      37.3      37.6
20  200  1    0.047   0.051     0.051  0.094  0.099    0.099  18.8    18.8     392.1    391.0      37.2      37.3
20  200  2    0.045   0.049     0.049  0.092  0.099    0.099  18.8    18.8     390.6    389.9      37.1      36.7

50  20   0    0.014   0.169     0.169  0.034  0.230    0.231  43.8    44.6     1945.7   2064.8     29.2      79.4
50  20   0.2  0.014   0.176     0.177  0.035  0.239    0.240  43.7    44.6     1940.1   2065.0     27.6      80.0
50  20   0.5  0.014   0.214     0.216  0.033  0.274    0.275  43.5    44.4     1914.7   2054.5     20.6      79.0
50  20   1    0.012   0.290     0.288  0.030  0.338    0.336  43.0    44.1     1862.5   2019.2     10.5      76.8
50  20   2    0.008   0.061     0.057  0.022  0.116    0.112  42.9    43.3     1907.4   1949.7     67.4      72.8
50  30   0    0.025   0.073     0.073  0.055  0.130    0.130  46.0    46.3     2182.5   2226.6     69.2      87.0
50  30   0.2  0.024   0.073     0.073  0.054  0.129    0.129  45.9    46.2     2179.8   2224.8     68.7      87.0
50  30   0.5  0.023   0.076     0.076  0.052  0.133    0.133  45.8    46.1     2166.7   2213.2     66.3      85.9
50  30   1    0.022   0.083     0.083  0.050  0.142    0.142  45.5    45.9     2134.5   2195.9     61.8      84.7
50  30   2    0.017   0.063     0.062  0.042  0.116    0.115  45.2    45.3     2110.0   2136.7     71.4      80.6
50  40   0    0.030   0.060     0.060  0.066  0.113    0.113  46.9    47.0     2277.4   2299.9     80.8      89.6
50  40   0.2  0.029   0.059     0.059  0.065  0.113    0.113  46.9    47.0     2275.6   2297.9     80.6      89.2
50  40   0.5  0.030   0.062     0.062  0.065  0.114    0.115  46.8    46.9     2266.9   2290.7     79.3      89.6
50  40   1    0.028   0.065     0.065  0.062  0.118    0.119  46.6    46.8     2243.9   2277.9     76.7      88.1
50  40   2    0.024   0.057     0.057  0.054  0.109    0.109  46.2    46.3     2214.5   2231.7     79.3      85.6
50  100  0    0.041   0.051     0.052  0.086  0.102    0.102  48.2    48.2     2419.8   2419.3     93.5      94.7
50  100  0.2  0.042   0.052     0.052  0.086  0.102    0.102  48.2    48.2     2419.2   2421.7     93.4      94.6
50  100  0.5  0.041   0.052     0.052  0.085  0.101    0.101  48.2    48.2     2416.3   2419.5     93.2      94.2
50  100  1    0.040   0.051     0.051  0.084  0.102    0.102  48.1    48.1     2408.0   2411.3     92.6      94.2
50  100  2    0.038   0.051     0.051  0.081  0.101    0.101  48.0    48.0     2391.7   2394.1     92.2      93.1
50  200  0    0.046   0.051     0.051  0.095  0.102    0.102  48.6    48.7     2460.7   2466.4     96.0      96.8
50  200  0.2  0.045   0.050     0.050  0.093  0.101    0.101  48.6    48.6     2460.5   2460.2     96.0      96.6
50  200  0.5  0.045   0.050     0.050  0.093  0.100    0.100  48.6    48.6     2459.1   2456.6     95.9      96.5
50  200  1    0.046   0.051     0.051  0.093  0.102    0.102  48.6    48.6     2455.1   2458.3     95.7      96.2
50  200  2    0.043   0.049     0.049  0.089  0.099    0.099  48.5    48.4     2446.5   2439.0     95.4      95.7

Table 7: Type I error of the standard Q test and the improved Q 
test for homogeneity under the null, and moments of the distribution 
of Q. Sample sizes are equal and balanced. The column headings 
are defined in Section 5.1. Here 5 — ^2 A~ 1 5i/ ^ A^ 1 . 
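
As a small worked illustration of how a gamma approximation to the null distribution of Q could be parameterized from a mean and a variance such as those tabulated above, the sketch below matches the first two moments. Moment matching is an assumption made here as one natural reading of "estimated shape and scale parameters"; the function name is hypothetical, and the inputs are the tabulated values for I = 5, N = 20, δ = 0 (E_f(Q) = 3.7 with Var_f(Q) = 5.2, and \bar{Q} = 3.7 with s²(Q) = 6.6).

```python
def gamma_from_moments(mean, variance):
    """Shape and scale of the gamma distribution with the given mean and variance."""
    shape = mean * mean / variance   # k = mean^2 / variance
    scale = variance / mean          # s = variance / mean
    return shape, scale


# Sanity check: a chi-square with 4 df has mean 4 and variance 8,
# which is a gamma distribution with shape 2 and scale 2.
print(gamma_from_moments(4.0, 8.0))

# Tabulated mean/variance pairs for I = 5, N = 20, delta = 0.
theory_shape, theory_scale = gamma_from_moments(3.7, 5.2)
sim_shape, sim_scale = gamma_from_moments(3.7, 6.6)
print(f"from E_f(Q), Var_f(Q): shape={theory_shape:.3f}, scale={theory_scale:.3f}")
print(f"from simulated moments: shape={sim_shape:.3f}, scale={sim_scale:.3f}")
```

The two parameter pairs differ noticeably for this small study size because the approximate variance (5.2) is below the simulated one (6.6), while the means agree to one decimal place.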
 I    N  χ²_{.05}  Γ_{.05}  χ²_{E(Q),.05}  χ²_{.1}  Γ_{.1}  χ²_{E(Q),.1}  E_f(Q)  Q     E_f(Q²)  Q²     Var_f(Q)  s²(Q)
 5   60  0.041     0.056    0.048          0.086    0.107   0.097         3.8     3.8   21.1     21.8   6.7       7.2
 5  100  0.046     0.051    0.048          0.095    0.102   0.099         3.9     3.9   23.0     23.2   7.6       7.7
 5  160  0.050     0.052    0.051          0.098    0.101   0.100         4.0     4.0   23.5     23.7   7.8       7.9
10   60  0.038     0.057    0.047          0.081    0.110   0.098         8.6     8.6   88.0     90.4   14.8      16.3
10  100  0.045     0.051    0.049          0.093    0.102   0.099         8.8     8.9   95.0     95.7   17.1      17.3
10  160  0.048     0.051    0.049          0.095    0.100   0.098         8.9     8.9   96.9     96.7   17.5      17.6
20   60  0.034     0.060    0.048          0.074    0.113   0.098         18.0    18.1  356.3    363.5  30.7      34.5
20  100  0.043     0.051    0.048          0.088    0.101   0.097         18.6    18.6  382.8    383.0  35.9      36.3
20  160  0.046     0.050    0.049          0.093    0.100   0.098         18.8    18.8  390.3    389.6  36.9      37.1



Table 8: Type I error of the standard Q test and the improved Q 
test for homogeneity under the null and moments of the distribution 
of Q. Sample sizes are unequal but balanced. The column headings 
are defined in Section 5.1. 
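The empirical columns in tables of this kind can be computed from a list of simulated Q values. The sketch below is illustrative only: it assumes that the simulation-based columns are the rejection rate at a given critical value, the sample mean of Q, the sample mean of Q squared, and the sample variance of Q. The authoritative definitions are those of Section 5.1, and the function name `summarize_q` is hypothetical.

```python
from statistics import mean, variance


def summarize_q(q_values, critical_value):
    """Summarize simulated Q statistics (assumed column meanings, see lead-in)."""
    q_values = list(q_values)
    rejections = sum(1 for q in q_values if q > critical_value)
    return {
        # Fraction of simulated Q values that exceed the critical value.
        "rejection_rate": rejections / len(q_values),
        # Sample mean of Q and of Q squared.
        "mean_q": mean(q_values),
        "mean_q2": mean([q * q for q in q_values]),
        # Unbiased sample variance (divisor n - 1).
        "var_q": variance(q_values),
    }


if __name__ == "__main__":
    # Toy input, not simulation output; 9.488 is the upper 5% point of
    # the chi-square distribution with 4 degrees of freedom.
    print(summarize_q([1.0, 2.0, 3.0, 10.0], 9.488))
```

Comparing the sample mean and variance with the theoretical moments in the same row indicates how well the moment formulas track the simulated distribution of Q.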
                 power at level 0.05              power at level 0.10
  I    N  τ²     χ²_{k-1}   Gamma      χ²_{E(Q)}  χ²_{k-1}   Gamma      χ²_{E(Q)}
  5   20  0.025  0.051      0.099      0.065      0.100      0.160      0.127
  5   20  0.05   0.071      0.134      0.089      0.136      0.203      0.163
  5   20  0.1    0.121      0.208      0.148      0.210      0.294      0.245
  5   20  0.15   0.177      0.274      0.209      0.277      0.369      0.316
  5   20  0.2    0.230      0.340      0.266      0.343      0.438      0.387
  5   20  0.25   0.281      0.397      0.321      0.399      0.500      0.442
  5   30  0.025  0.072      0.095      0.082      0.134      0.164      0.150
  5   30  0.05   0.111      0.142      0.125      0.190      0.228      0.210
  5   30  0.1    0.196      0.236      0.214      0.297      0.339      0.320
  5   30  0.15   0.286      0.336      0.307      0.402      0.447      0.426
  5   30  0.2    0.369      0.416      0.388      0.481      0.522      0.503
  5   30  0.25   0.444      0.491      0.463      0.551      0.591      0.574
  5   40  0.025  0.092      0.107      0.099      0.161      0.182      0.174
  5   40  0.05   0.147      0.168      0.156      0.232      0.258      0.247
  5   40  0.1    0.271      0.302      0.286      0.383      0.413      0.401
  5   40  0.15   0.387      0.418      0.402      0.500      0.528      0.515
  5   40  0.2    0.486      0.516      0.501      0.592      0.612      0.605
  5   40  0.25   0.560      0.590      0.575      0.662      0.683      0.674
  5   50  0.025  0.110      0.122      0.117      0.182      0.199      0.192
  5   50  0.05   0.196      0.214      0.204      0.292      0.313      0.304
  5   50  0.1    0.330      0.354      0.342      0.447      0.467      0.459
  5   50  0.15   0.465      0.488      0.478      0.572      0.588      0.582
  5   50  0.2    0.570      0.589      0.581      0.663      0.680      0.674
  5   50  0.25   0.650      0.671      0.661      0.731      0.746      0.740
  5   60  0.025  0.127      0.138      0.132      0.210      0.224      0.217
  5   60  0.05   0.225      0.242      0.233      0.326      0.342      0.336
  5   60  0.1    0.394      0.411      0.403      0.501      0.517      0.510
  5   60  0.15   0.531      0.547      0.539      0.633      0.645      0.641
  5   60  0.2    0.633      0.647      0.640      0.718      0.729      0.725
  5   60  0.25   0.707      0.720      0.714      0.782      0.791      0.788
  5   80  0.025  0.160      0.170      0.165      0.254      0.264      0.261
  5   80  0.05   0.295      0.309      0.303      0.406      0.417      0.413
  5   80  0.1    0.496      0.509      0.504      0.598      0.608      0.604
  5   80  0.15   0.639      0.648      0.643      0.717      0.724      0.721
  5   80  0.2    0.728      0.736      0.732      0.798      0.804      0.802
  5   80  0.25   0.800      0.806      0.803      0.853      0.857      0.856

 10   20  0.025  0.050      0.141      0.078      0.103      0.211      0.150
 10   20  0.05   0.083      0.195      0.118      0.151      0.277      0.205
 10   20  0.1    0.161      0.310      0.215      0.258      0.414      0.324
 10   20  0.15   0.256      0.432      0.319      0.371      0.531      0.445
 10   20  0.2    0.347      0.532      0.417      0.471      0.628      0.544
 10   20  0.25   0.445      0.627      0.515      0.567      0.708      0.640
 10   30  0.025  0.081      0.121      0.097      0.150      0.201      0.180
 10   30  0.05   0.147      0.210      0.178      0.243      0.305      0.279
 10   30  0.1    0.294      0.369      0.331      0.410      0.486      0.455
 10   30  0.15   0.438      0.520      0.482      0.561      0.628      0.600
 10   30  0.2    0.564      0.638      0.602      0.676      0.733      0.709
 10   30  0.25   0.668      0.732      0.702      0.764      0.811      0.791
 10   40  0.025  0.106      0.135      0.121      0.184      0.222      0.207
 10   40  0.05   0.203      0.244      0.223      0.310      0.354      0.338
 10   40  0.1    0.411      0.464      0.440      0.538      0.582      0.568
 10   40  0.15   0.600      0.642      0.627      0.703      0.738      0.725
 10   40  0.2    0.719      0.753      0.736      0.799      0.826      0.815
 10   40  0.25   0.803      0.833      0.821      0.865      0.885      0.877
 10   50  0.025  0.136      0.159      0.151      0.224      0.258      0.246
 10   50  0.05   0.267      0.299      0.285      0.381      0.415      0.404
 10   50  0.1    0.524      0.556      0.544      0.634      0.660      0.652
 10   50  0.15   0.698      0.730      0.716      0.783      0.806      0.798
 10   50  0.2    0.809      0.830      0.822      0.872      0.888      0.882
 10   50  0.25   0.878      0.891      0.886      0.919      0.928      0.925
 10   60  0.025  0.170      0.191      0.181      0.268      0.292      0.284
 10   60  0.05   0.338      0.366      0.354      0.458      0.486      0.478
 10   60  0.1    0.619      0.644      0.635      0.717      0.736      0.730
 10   60  0.15   0.780      0.797      0.791      0.845      0.858      0.853
 10   60  0.2    0.862      0.875      0.869      0.909      0.917      0.914
 10   60  0.25   0.917      0.927      0.923      0.948      0.953      0.951
 10   80  0.025  0.229      0.245      0.239      0.332      0.347      0.343
 10   80  0.05   0.459      0.477      0.469      0.572      0.591      0.585
 10   80  0.1    0.734      0.748      0.742      0.811      0.822      0.819
 10   80  0.15   0.877      0.887      0.884      0.919      0.925      0.923
 10   80  0.2    0.940      0.944      0.942      0.961      0.965      0.963
 10   80  0.25   0.962      0.965      0.964      0.975      0.977      0.977

 20   20  0.025  0.051      0.199      0.100      0.106      0.277      0.178
 20   20  0.05   0.096      0.298      0.160      0.170      0.391      0.270
 20   20  0.1    0.217      0.479      0.314      0.329      0.577      0.448
 20   20  0.15   0.383      0.659      0.506      0.520      0.738      0.632
 20   20  0.2    0.523      0.780      0.638      0.652      0.841      0.754
 20   20  0.25   0.655      0.862      0.751      0.761      0.901      0.841
 20   30  0.025  0.098      0.169      0.132      0.174      0.259      0.225
 20   30  0.05   0.197      0.298      0.256      0.306      0.413      0.372
 20   30  0.1    0.451      0.566      0.514      0.576      0.680      0.642
 20   30  0.15   0.670      0.765      0.724      0.770      0.838      0.815
 20   30  0.2    0.798      0.869      0.839      0.874      0.918      0.903
 20   30  0.25   0.888      0.929      0.913      0.932      0.957      0.949
 20   40  0.025  0.138      0.193      0.170      0.235      0.296      0.277
 20   40  0.05   0.311      0.379      0.353      0.429      0.500      0.478
 20   40  0.1    0.627      0.695      0.667      0.736      0.786      0.770
 20   40  0.15   0.818      0.858      0.842      0.885      0.911      0.903
 20   40  0.2    0.917      0.938      0.930      0.949      0.961      0.958
 20   40  0.25   0.962      0.972      0.968      0.980      0.986      0.984
 20   50  0.025  0.189      0.230      0.214      0.293      0.342      0.326
 20   50  0.05   0.407      0.460      0.441      0.534      0.586      0.573
 20   50  0.1    0.758      0.797      0.781      0.841      0.867      0.859
 20   50  0.15   0.913      0.928      0.924      0.947      0.960      0.956
 20   50  0.2    0.968      0.976      0.973      0.983      0.988      0.986
 20   50  0.25   0.987      0.990      0.989      0.993      0.995      0.994
 20   60  0.025  0.241      0.278      0.266      0.356      0.392      0.382
 20   60  0.05   0.502      0.547      0.531      0.627      0.664      0.654
 20   60  0.1    0.843      0.865      0.856      0.901      0.914      0.911
 20   60  0.15   0.951      0.959      0.955      0.971      0.975      0.974
 20   60  0.2    0.985      0.989      0.988      0.993      0.994      0.994
 20   60  0.25   0.995      0.996      0.996      0.998      0.998      0.998
 20   80  0.025  0.341      0.369      0.360      0.467      0.497      0.488
 20   80  0.05   0.671      0.697      0.687      0.768      0.787      0.782
 20   80  0.1    0.935      0.943      0.942      0.962      0.966      0.966
 20   80  0.15   0.985      0.988      0.987      0.992      0.993      0.993
 20   80  0.2    0.997      0.997      0.997      0.999      0.999      0.999
 20   80  0.25   0.999      0.999      0.999      0.999      1.000      0.999

 50   20  0.025  0.050      0.391      0.139      0.102      0.465      0.238
 50   20  0.05   0.131      0.577      0.276      0.221      0.643      0.412
 50   20  0.1    0.388      0.838      0.600      0.530      0.877      0.722
 50   20  0.15   0.661      0.950      0.819      0.771      0.965      0.894
 50   20  0.2    0.833      0.983      0.925      0.903      0.990      0.963
 50   20  0.25   0.929      0.996      0.974      0.962      0.998      0.988
 50   30  0.025  0.126      0.270      0.196      0.211      0.383      0.318
 50   30  0.05   0.332      0.531      0.446      0.466      0.647      0.586
 50   30  0.1    0.753      0.876      0.830      0.842      0.925      0.900
 50   30  0.15   0.937      0.975      0.963      0.966      0.988      0.982
 50   30  0.2    0.985      0.995      0.992      0.993      0.998      0.996
 50   30  0.25   0.998      1.000      0.999      0.999      1.000      1.000
 50   40  0.025  0.216      0.327      0.286      0.334      0.447      0.415
 50   40  0.05   0.537      0.656      0.614      0.666      0.760      0.733
 50   40  0.1    0.917      0.951      0.942      0.954      0.974      0.970
 50   40  0.15   0.989      0.994      0.993      0.995      0.997      0.997
 50   40  0.2    0.998      0.999      0.999      0.999      1.000      1.000
 50   40  0.25   1.000      1.000      1.000      1.000      1.000      1.000
 50   50  0.025  0.317      0.402      0.376      0.443      0.534      0.512
 50   50  0.05   0.697      0.769      0.748      0.798      0.850      0.837
 50   50  0.1    0.974      0.982      0.979      0.986      0.992      0.991
 50   50  0.15   0.998      0.999      0.999      0.999      1.000      0.999
 50   50  0.2    1.000      1.000      1.000      1.000      1.000      1.000
 50   50  0.25   1.000      1.000      1.000      1.000      1.000      1.000
 50   60  0.025  0.412      0.480      0.461      0.539      0.608      0.592
 50   60  0.05   0.814      0.860      0.848      0.890      0.917      0.911
 50   60  0.1    0.993      0.995      0.995      0.996      0.997      0.997
 50   60  0.15   1.000      1.000      1.000      1.000      1.000      1.000
 50   60  0.2    1.000      1.000      1.000      1.000      1.000      1.000
 50   60  0.25   1.000      1.000      1.000      1.000      1.000      1.000
 50   80  0.025  0.598      0.645      0.634      0.720      0.755      0.747
 50   80  0.05   0.938      0.951      0.948      0.968      0.973      0.972
 50   80  0.1    0.998      0.999      0.999      0.999      1.000      1.000
 50   80  0.15   1.000      1.000      1.000      1.000      1.000      1.000
 50   80  0.2    1.000      1.000      1.000      1.000      1.000      1.000
 50   80  0.25   1.000      1.000      1.000      1.000      1.000      1.000



Table 9: Power of the standard chi-square based Q test and the 
improved Q test for homogeneity (gamma and chi-square with E(Q) 
degrees of freedom approximations) at the nominal 5% and 10% 
levels. I is the number of studies, each of size N, equally divided 
between the treatment and control arms. The effects have a mean 
of δ = 0.5 and a variance of τ². 
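A Monte Carlo experiment of the kind summarized in Table 9 can be sketched as follows, for the standard chi-square test only. This is a minimal illustration under stated assumptions, not the authors' simulation code: the study effects are drawn from a normal distribution with mean δ and variance τ², the variance of each standardized mean difference is estimated with the common large-sample formula 2/n + d²/(4n) for n subjects per arm, and the critical values for 4 degrees of freedom (five studies minus one) are hard-coded. All function names are hypothetical, and the estimates need not reproduce the tabulated values.

```python
import math
import random

# Upper critical values of the chi-square distribution with 4 degrees of
# freedom (five studies minus one), hard-coded to avoid external packages.
CHI2_4_CRIT = {0.05: 9.488, 0.10: 7.779}


def cohens_d(n_per_arm, delta, rng):
    """Simulate one two-arm study; return its standardized mean difference."""
    treat = [rng.gauss(delta, 1.0) for _ in range(n_per_arm)]
    ctrl = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    mean_t = sum(treat) / n_per_arm
    mean_c = sum(ctrl) / n_per_arm
    ss = sum((x - mean_t) ** 2 for x in treat)
    ss += sum((x - mean_c) ** 2 for x in ctrl)
    pooled_sd = math.sqrt(ss / (2 * n_per_arm - 2))
    return (mean_t - mean_c) / pooled_sd


def cochran_q(effects, variances):
    """Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def rejection_rate(n_studies, study_size, tau2, crit,
                   delta=0.5, reps=2000, seed=1):
    """Fraction of simulated meta-analyses in which Q exceeds crit."""
    rng = random.Random(seed)
    n = study_size // 2  # subjects per arm (balanced design)
    sd_between = math.sqrt(tau2)
    rejections = 0
    for _rep in range(reps):
        effects, variances = [], []
        for _study in range(n_studies):
            true_effect = rng.gauss(delta, sd_between)
            d = cohens_d(n, true_effect, rng)
            effects.append(d)
            # Assumed large-sample variance estimate of d.
            variances.append(2.0 / n + d * d / (4.0 * n))
        if cochran_q(effects, variances) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for tau2 in (0.0, 0.1, 0.25):
        print(tau2, rejection_rate(5, 20, tau2, CHI2_4_CRIT[0.05]))
```

Setting `tau2=0.0` gives an estimate of the Type I error, and positive values give estimates of power. The improved tests would replace the fixed critical value with one taken from the gamma or fractional-degrees-of-freedom chi-square approximation, which this sketch does not attempt.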